Preface


About the Book 


This textbook is a treatment of the structure of abstract spaces, in particular linear, 
topological, metric, and normed spaces, as well as topological groups, in a rigorous 
and reader-friendly fashion. The assumed background knowledge on the part of the 
reader is modest, limited to basic concepts of finite dimensional linear spaces and 
elementary analysis. The book’s aim is to serve as an introduction toward the theory 
of Hilbert spaces and the theory of operators. 

The formalism of Hilbert spaces is fundamental to Physics, and in particular to 
Quantum Mechanics, requiring a certain amount of fluency with the techniques of 
linear algebra, metric space theory, and topology. Typical introductory level books 
devoted to Hilbert spaces assume a significant level of familiarity with the necessary 
background material and consequently present only a brief review of it. A reader 
who finds the overview insufficient is forced to consult other sources to fill the gap. 
Assuming only a rudimentary understanding of real analysis and linear algebra, this 
book offers an introduction to the mathematical prerequisites of Hilbert space 
theory in a single self-contained source. The final chapter is devoted to topological 
groups, offering a glimpse toward more advanced theory. The text is suitable for 
advanced undergraduate or introductory graduate courses for both Physics and 
Mathematics students. 


The Structure of the Book 

The book consists of six chapters, with an additional chapter of solved problems 
arranged by topic. Each chapter is composed of five sections, with each section 
accompanied by a set of exercises (with the exception of Chap. 6, which is shorter
and thus contains a single batch of exercises located at the end of the chapter). The 
total of 210 exercises and 50 solved problems comprise an integral part of the book 
designed to assist the reader, challenge her, and hone her intuition. 

Chapter 1 contains a general introduction to real analysis and, in particular, to 
each of the subjects presented in the chapters that follow. Chapter 1 also contains a 
Preliminaries section, intended to quickly orient the reader as to the notation and 
concepts used throughout the book, starting with sets and ending with an axiomatic 
presentation of the real numbers. 

Chapter 2 is devoted to linear spaces. At the advanced undergraduate level the 
reader is already familiar with at least some aspects of linear spaces, primarily finite 
dimensional ones. The chapter does not rely on any previous knowledge though, 
and is in that sense self-contained. However, the material is somewhat advanced 
since the focus is infinite dimensional linear spaces, which are technically relatively 
demanding. 

Chapter 3 is an introduction to topology, a subject considered to be at a rather 
high level of abstraction. The main aim of the chapter is to familiarize the reader 
with the fundamentals of the theory, and in particular those that are most directly 
relevant for real analysis and Hilbert spaces. Care is taken to find a reasonable
balance between the study of extreme topology, i.e., spaces or phenomena that one 
may consider pathological but that hone the topological intuition, and mundane 
topology, i.e., those spaces or phenomena one is most likely to find in nature, but 
which may obscure the true nature of topology. 

Chapter 4 is a study of metric spaces. Once the necessary fundamentals are 
covered, the main focus is complete metric spaces. In particular, the Banach Fixed- 
Point Theorem and Baire’s Theorem are proved and completions are discussed, 
topics which are indispensable for Hilbert space theory. 

Chapter 5 introduces and studies normed spaces and Banach spaces. Starting 
with semi-normed spaces the chapter establishes the fundamentals, and goes on to 
introduce Banach spaces, treating the Open Mapping Theorem, the Hahn-Banach 
Theorem, the Closed Graph Theorem, and, alluding to Hilbert space theory proper, 
the Riesz Representation Theorem. 

Chapter 6 is a short introduction to topological groups, emphasizing their relation
to Banach spaces. The chapter does not assume any knowledge of group
theory, and thus, to remain self-contained, it presents all relevant group-theoretic 
notions. The chapter ends with a treatment of uniform spaces and a hint of their 
usefulness in the general theory. 


A Word About the Intended Audience 

The book is aimed at the advanced undergraduate or beginning postgraduate level, 
with the general prerequisite of sufficient mathematical maturity as expected at that 
level of studies. The book should be of interest to the student knowing nothing of 
Hilbert space theory who wishes to master its prerequisites. The book should also
be useful to the reader who is already familiar with some aspects of Hilbert space 
theory, linear spaces, topology, or metric spaces, as the book contains all the 
relevant definitions and pivotal theorems in each of the subjects it covers. Moreover,
the book contains a chapter on topological groups and a treatment of uniform
spaces, a topic usually considered at a more advanced level. 


A Word About the Authors 

The authors of the book, a physicist and a mathematician, by writing the entire book 
together, and through many arguments about notation and style, hope that this clash 
between the desires of a physicist to quickly yet intelligibly get to the point and the 
insistence of a mathematician on rigor and conciseness did not leave the pages of 
this book tainted with blood, but rather that it resulted in a welcoming introduction 
for both physicists and mathematicians interested in Hilbert space theory. 

Prof. Carlo Alabiso obtained his Degree in Physics at Milan University, Italy, 
and then taught for more than 40 years at Parma University, Parma, Italy (with a 
period spent as a research fellow at the Stanford Linear Accelerator Center and at 
CERN, Geneva). His teaching encompassed topics in Quantum Mechanics, special
relativity, field theory, elementary particle physics, mathematical physics, and
functional analysis. His research fields include mathematical physics (Padé
approximants), elementary particle physics (symmetries and quark models), and
statistical physics (ergodic problems), and he has published articles in a wide range of
national and international journals as well as the previous Springer book (with 
Alessandro Chiesa), Problemi di Meccanica Quantistica non Relativistica. 

Dr. Ittay Weiss completed his B.Sc. and M.Sc. studies in Mathematics at the 
Hebrew University of Jerusalem and he obtained his Ph.D. in mathematics from 
Universiteit Utrecht in The Netherlands. He spent an additional 3 years in Utrecht 
as an assistant professor of mathematics where he taught mathematics courses 
across the entire undergraduate spectrum both at Utrecht University and at the 
affiliated University College Utrecht. He is currently a mathematics lecturer at the 
University of the South Pacific. His research interests lie in the fields of algebraic 
topology and operad theory, as well as the mathematical foundations of analysis 
and generalizations of metric spaces. 
Chapter 1

Introduction and Preliminaries 


Abstract This short chapter, consisting of two sections, contains a brief overview 
of the mathematical foundations of Hilbert spaces and analysis, a discussion of the 
real number system, and a detailed account of the preliminaries required for reading 
the rest of the book. 

Keywords Set • Equivalence relation • Function • Zorn’s Lemma • Relation •
Cardinality • Countable set • Ordered set • Real numbers axioms • Set operations


1.1 Hilbert Space Theory — A Quick Overview 


Hilbert space theory is the mathematical formalism for non-relativistic Quantum 
Mechanics. A Hilbert space is a linear space together with an inner product that 
endows the space with the structure of a complete metric space. Hilbert space theory 
is thus a fusion of algebra, topology, and geometry. 

The reader is already familiar with the intermingling of algebra and geometry, 
namely in the linear space $\mathbb{R}^n$. Elements in $\mathbb{R}^n$ can be thought of as points in
$n$-dimensional space but also as vectors. Typically, points have coordinates and vectors
can be added and scaled. Moreover, in the presence of the standard inner product,
given by $\langle x, y \rangle = \sum_{k=1}^{n} x_k y_k$, the length of a vector is given by the norm

$$\|x\| = \sqrt{\langle x, x \rangle}$$

and the angle between vectors can be computed by means of the formula

$$\theta = \arccos \frac{\langle x, y \rangle}{\|x\| \, \|y\|}.$$
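As a small numerical illustration of the inner product, norm, and angle formulas, the following Python sketch may help. The function names `inner`, `norm`, and `angle` are our own choices for illustration and are not notation used in the book; floating point arithmetic is assumed, so the results are approximations.

```python
import math

def inner(x, y):
    """Standard inner product on R^n: the sum of x_k * y_k."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same dimension")
    return sum(xk * yk for xk, yk in zip(x, y))

def norm(x):
    """Length of a vector, the square root of inner(x, x)."""
    return math.sqrt(inner(x, x))

def angle(x, y):
    """Angle in radians between two nonzero vectors."""
    c = inner(x, y) / (norm(x) * norm(y))
    # Clamp to [-1, 1] to guard against floating point rounding.
    return math.acos(max(-1.0, min(1.0, c)))

x = (3.0, 4.0)
y = (4.0, -3.0)
print(norm(x))       # length of x
print(angle(x, y))   # the vectors are orthogonal, so this is pi/2
```

Here the two vectors have inner product zero, so the computed angle is a right angle.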


At this level, the interaction between the algebra and the geometry is quite smooth. 
Things change though as soon as one considers infinite dimensional linear spaces, 
and this is also where topology comes into the picture. For finite dimensional linear
spaces, bases (and thus coordinates) are readily available, all linear operators
are continuous, convergence of operators has a single meaning, any linear space is
naturally isomorphic to its double dual, and the closed unit ball is compact. These
conveniences do not exist in the infinite dimensional case. While bases do exist, the 
proof of their existence is non-constructive and more often than not no basis can 
be given explicitly. Consequently, techniques that rely on coordinates and matrices 
are generally unsuitable. Linear operators need not be continuous, and in fact many 
linear operators of interest are not continuous. The space consisting of all linear 
operators between two linear spaces carries two different topologies, and thus two 
different notions of convergence. The correct notion of the dual space is that of all 
continuous linear operators into the ground field and even then the original space 
only embeds in its double dual. Lastly, the closed unit ball is not compact. 

In a sense, the aim of Hilbert space theory is to develop adequate machinery to 
be able to reason about, and in, infinite dimensional linear spaces subject to these 
inherent difficulties and topological subtleties. The mathematics background required 
for the study of Hilbert spaces thus includes linear algebra, topology, the theory of 
metric spaces, and the theory of normed spaces. The theory of topological groups, 
a fusion between group theory and topology, arises naturally as well. Before we 
proceed to elaborate on each of these topics we visit the most fundamental space of 
them all, that of the real numbers. 


1.1.1 The Real Numbers — Where it All Begins 

Most fundamental to human observation of the outside world are the real numbers. 
The outcome of a measurement is almost always assumed to be a real number (at 
least in some ideal sense). That this idealization is deeply ingrained in the scientific 
lore is witnessed by the very name of the real numbers; among all numbers, these 
are the real ones! The reason for science’s strong preference for
the real numbers is a matter for philosophers to debate. We merely observe that, whether the
use of real numbers is justified or not, the success of the real numbers in science is 
unquestionable. 

Mathematically though, the real numbers pose several non-trivial challenges. One 
of those challenges is the very definition of the real numbers, or, in other words, 
answering the question: what is a real number? For the ancient Greeks, roughly 
speaking until the Pythagoreans, the real numbers were taken for granted as being 
the same as what we would today call the rationals. That was the commonly held 
belief until the discovery that $\sqrt{2}$ is an irrational number (the precise circumstances
of that discovery are unclear). The construction of precise models for the reals had 
to wait numerous centuries, and so did the discovery of transcendental numbers and 
the understanding of just how many real numbers there are. 
1.1.1.1 Constructing the Real Numbers

The reader is likely to have her own personal idea(lization) regarding what the real 
numbers really are. Any precise interpretation or elucidation of the true entities 
which are the real numbers belongs to philosophy. Choosing to stay firmly afoot in 
mathematics, we present now one of many precise constructions of a model of the 
real numbers. As with the construction of any model, one can not construct something 
from nothing. We thus assume the reader accepts the existence of (a model of) the 
rational numbers. In other words, we pretend to agree on what the rational numbers 
are, but to have no knowledge of the real number system. We will give the precise 
construction but, to remain in the spirit of an introduction, we avoid the details of 
the proofs. 

To motivate the following construction of the real numbers from the rational 
numbers, recall that the rational numbers are dense in the real numbers, meaning that 
between any two real numbers there exists a rational number. This simple observation 
gives rise to the crucial fact that every real number is the limit of a sequence of rational 
numbers. Thus, the convergent sequences of rational numbers give us access to all of 
the real numbers. Since many different sequences of rational numbers may converge 
to the same real number, the set of all convergent sequences of rational numbers is 
too wasteful to be taken as the definition of the real numbers. One needs to introduce 
an equivalence relation on it, one that identifies two sequences of rational numbers 
if they converge to the same real number. 

To conclude the discussion so far, if S denotes the set of all convergent sequences 
of rational numbers, then we expect to be able to identify an equivalence relation ~ 
on S such that S/~ forms a model of the real numbers. However, to even take the 
first step in this plan without committing the sin of a circular argument, one must 
first be able to identify the convergent sequences among all sequences of rational 
numbers without any a-priori knowledge of the real numbers. That task is achieved 
by appealing to the familiar notion of a Cauchy sequence, a condition which, for 
sequences of real numbers, is well-known to be equivalent to convergence. The key 
observation is that the Cauchy condition for sequences of rational numbers can be 
stated without mentioning real numbers at all. In this way the convergent sequences 
of rational numbers are carved from the set of all sequences of rationals by means 
of an inherently rational criterion. 

The construction we give is due to Georg Cantor, the creator of the theory of sets. 
On the set $\mathbb{Q}$ of rational numbers, which we assume is endowed with the familiar
notions of addition and multiplication, and thus also with the notions of subtraction
and division, consider the function $d : \mathbb{Q} \times \mathbb{Q} \to \mathbb{Q}$ given by $d(x, y) = |x - y|$.
We now define a Cauchy sequence to be a sequence $(x_n)$ of rational numbers which
satisfies the condition that for every rational number $\varepsilon > 0$ there exists $N \in \mathbb{N}$ such
that $d(x_n, x_m) < \varepsilon$ for all $n, m > N$. Let now $S$ be the set of all Cauchy sequences
and define the relation $\sim$ on $S$ as follows. Given Cauchy sequences $(x_n), (y_n) \in S$,
declare that $(x_n) \sim (y_n)$ if $d(x_n, y_n) \to 0$. Note that convergence to 0 (or any other
rational number) of a sequence of rational numbers can be stated without assuming
the existence of real numbers. Indeed, the meaning of $(x_n) \sim (y_n)$ is that for every
rational number $\varepsilon > 0$ there exists an $N \in \mathbb{N}$ such that $d(x_n, y_n) < \varepsilon$ for all $n > N$.
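To make the rational character of these definitions concrete, here is a minimal Python sketch using exact rational arithmetic. The two sequences chosen (decimal truncations and bisection midpoints, both approaching the square root of 2) and all function names are our own illustrative assumptions. A finite check of finitely many terms is of course not a proof; it only illustrates what the conditions say.

```python
from fractions import Fraction
from math import isqrt

def d(x, y):
    """The rational-valued distance d(x, y) = |x - y|."""
    return abs(x - y)

def decimal_truncations(n_terms):
    """x_n = floor(sqrt(2) * 10**n) / 10**n, computed with integers only."""
    return [Fraction(isqrt(2 * 10 ** (2 * n)), 10 ** n) for n in range(n_terms)]

def bisection_midpoints(n_terms):
    """Midpoints of rational intervals [lo, hi] with lo**2 < 2 < hi**2."""
    lo, hi = Fraction(1), Fraction(2)
    terms = []
    for _ in range(n_terms):
        mid = (lo + hi) / 2
        terms.append(mid)
        if mid * mid < 2:
            lo = mid
        else:
            hi = mid
    return terms

def close_from(xs, ys, eps, start):
    """Check d(x_n, y_n) < eps for all available indices n >= start."""
    return all(d(x, y) < eps for x, y in zip(xs[start:], ys[start:]))

xs = decimal_truncations(30)
ys = bisection_midpoints(30)

# Every quantity below is rational; no real number is ever mentioned.
print(close_from(xs, ys, Fraction(1, 1000), 9))
print([float(d(x, y)) for x, y in zip(xs, ys)][:5])
```

The two sequences differ term by term, yet their distance tends to 0, so they represent the same element of the quotient.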

It is a straightforward matter to show that $\sim$ is an equivalence relation on $S$.
We now define the set of real numbers to be $\mathbb{R} = S/{\sim}$, namely the quotient set
determined by the equivalence relation. One immediate thing to notice is that $\mathbb{R}$
contains a copy of $\mathbb{Q}$. Indeed, for any rational number $q \in \mathbb{Q}$ consider the constant
sequence $x_q = (x_n = q)$. It is clearly a Cauchy sequence, and thus $x_q \in S$, and
its equivalence class $[x_q]$ is thus, per definition, a real number. Further, if $q' \ne q$ is
another rational, then $[x_q] \ne [x_{q'}]$ since the respective constant sequences are not
equivalent, and thus not identified in the quotient set. In other words, the function
$q \mapsto [x_q]$ is an injection from $\mathbb{Q}$ to $\mathbb{R}$.

The familiar algebraic structure of $\mathbb{Q}$ carries over to $\mathbb{R}$ as follows. Given real
numbers $x$, $y$, represented by Cauchy sequences $(x_n)$ and $(y_n)$ respectively, the sum
$x + y$ is defined to be represented by $(x_n + y_n)$ and the product $xy$ to be represented
by $(x_n y_n)$. Of course, one needs to check these are well-defined notions, i.e.,
that the proposed sequences are Cauchy and that the resulting sum and product are
independent of the choice of representatives. This verification, especially for the
sum, is straightforward. Once that is done, the verification of the familiar algebraic
properties of addition and multiplication is easily performed, proving that $\mathbb{R}$ is a field.
Similarly, the order structure of $\mathbb{Q}$ carries over to $\mathbb{R}$ by defining, for real numbers $x$
and $y$ as above, that $x \le y$ if either $x = y$ or if $x_n < y_n$ for all but finitely many $n$. It
is quite straightforward to verify that $\mathbb{R}$ is then an ordered field. Finally, and by far
least trivially, $\mathbb{R}$ satisfies the least upper bound property, proving that it is Dedekind
complete.

The embedding of $\mathbb{Q}$ in $\mathbb{R}$ described above is easily seen to respect the order and
algebraic structure of $\mathbb{Q}$. Therefore the field $\mathbb{Q}$ of rational numbers we started with
can be identified with its isomorphic copy in $\mathbb{R}$, namely the image of the embedding
$\mathbb{Q} \to \mathbb{R}$, and one then considers $\mathbb{R}$ as an extension of $\mathbb{Q}$.


1.1.1.2 Transcendentals and Uncountability

The divide of the real numbers into rational and irrational numbers represents the 
first fundamental indication of the complexity of the real numbers. Together with 
the realization that $\sqrt{2}$ is an irrational number comes the necessity, if only out of
curiosity, of asking about the nature of other real numbers, for instance $\pi$ and $e$.
The number $e$ was introduced by Jacob Bernoulli in 1683 but it was only some
50 years later that Leonhard Euler established its irrationality. In sharp contrast, the
number $\pi$, known since antiquity, was only shown to be irrational in 1761 by Johann
Heinrich Lambert. If this is not indication enough that the real number system holds
more secrets than one would superficially expect, the fact that to date it is unknown
whether or not $\pi + e$ is irrational should remove any doubt.

Some real numbers, while irrational, may still be nearly rational, in the following 
sense. Consider a rational number $r = p/q$. Clearly, $r$ satisfies the equation

$$qt - p = 0,$$

and thus $r$ is a root of a polynomial with integer coefficients. More generally, any
real number which is a root of a polynomial with integer coefficients is called an
algebraic number. For instance, $\sqrt{2}$ is algebraic since it is a root of the polynomial
$t^2 - 2$. A real number that is not algebraic is called transcendental. The existence
of transcendental numbers was not established until 1844 by Joseph Liouville who,
shortly afterwards, constructed the first explicit example of a transcendental number
(albeit a somewhat artificial one). More prominent members of the transcendental
class are the number $e$, proved transcendental by Charles Hermite in 1873, and the
number $\pi$, proved transcendental by Ferdinand von Lindemann in 1882. While much
is known about transcendental numbers, it is still unknown if $\pi + e$ is algebraic or not.

In 1874 Georg Cantor, following Liouville’s proof of the existence
of transcendental numbers, proved not only that transcendental numbers exist but 
also that in some sense the vast majority of all real numbers are transcendental. 
The technique employed by Cantor was that of transfinite counting. With his notion 
of cardinality of sets, Cantor counted how many algebraic numbers there are and 
how many real numbers there are, and demonstrated that there are strictly more real 
numbers than algebraic ones. The inevitable conclusion that transcendental numbers 
exist, a conclusion that was met with some resistance, forces us to face the following 
equally inevitable truth. Similarly to Cantor’s counting of the algebraic numbers one 
can also count all sentences in any given natural language, for instance English. As 
it turns out, the cardinality of all potential English descriptions of real numbers is the 
same as the cardinality of the algebraic numbers, and thus is strictly smaller than the 
cardinality of all real numbers. We must now conclude that there exist real numbers 
that can not ever, even potentially, be described in any way at all. 
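The counting of the algebraic numbers can be made tangible with a short Python sketch. The idea, in the spirit of Cantor's argument, is to list all non-constant polynomials with integer coefficients in a single sequence; since each polynomial has only finitely many roots, the algebraic numbers can then be listed as well. The particular "height" used below (degree plus the sum of the absolute values of the coefficients) and the function names are assumptions made for this illustration, not the book's definitions.

```python
from itertools import product

def polynomials_of_height(h):
    """Yield coefficient tuples (a_0, ..., a_n) with n >= 1, a_n != 0,
    and n + |a_0| + ... + |a_n| == h. There are finitely many for each h."""
    for n in range(1, h):                  # n is the degree
        budget = h - n                     # required sum of |a_k|
        for coeffs in product(range(-budget, budget + 1), repeat=n + 1):
            if coeffs[-1] != 0 and sum(abs(a) for a in coeffs) == budget:
                yield coeffs

def enumerate_polynomials(max_height):
    """List polynomials height by height; every integer polynomial of
    degree at least 1 appears at the stage given by its height."""
    for h in range(2, max_height + 1):
        yield from polynomials_of_height(h)

listing = list(enumerate_polynomials(5))
print(len(listing))
print(listing.index((-2, 0, 1)))  # position of t^2 - 2 in the listing
```

The polynomial $t^2 - 2$, whose roots include $\sqrt{2}$, has height 5 under this convention and therefore shows up after finitely many steps, as does every other integer polynomial.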

We now end this short historical excursion and turn to quickly address the main 
topics covered in this book. 


1.1.2 Linear Spaces 

A linear space, also known as a vector space, embodies what is perhaps the simplest 
notion of a mathematical space. When thought of as modeling actual physical space, 
a linear space appears to be the same in all directions and is completely devoid of 
any curvature. Linear spaces, and the closely related notion of linear transformation 
or linear operator, are fundamental objects. For instance, the derivative of a function 
at a point is best understood as a linear operator on tangent spaces, particularly for 
functions of several variables. Differentiable manifolds are spaces which may be 
very complicated but, at each point, they are locally linear. It is perhaps somewhat of 
an exaggeration, but it is famously held to be true that if a problem can be linearized, 
then it is as good as solved. 
1.1.3 Topological Spaces

Topology is easy to define but hard to explain. The ideas that led to the development 
of topology were lurking beneath the surface for some time and it is difficult to 
pinpoint the exact time in history when topology was born. What is clear is that 
after its birth the advancement of topology in the first half of the 20th century was 
rapid. The unifying power of topology is immense and its explanatory ability is a 
powerful one. For instance, familiar theorems of first year real analysis, such as the 
existence of maxima and minima for continuous functions on a closed interval, or 
the uniform continuity of a continuous function on a closed interval, may make the 
closed interval appear to have special significance. Topology is able to clarify the 
situation, identifying a particular topological property of the closed interval, namely 
that it is compact, as the key ingredient enabling the proof to carry over into much 
more general situations. 


1.1.4 Metric Spaces 

Metric spaces appeared in 1906 in the Ph.D. dissertation of Maurice René Fréchet.
A metric space is a set where one can measure distances between points, and in
a strong sense a metric space carries much more geometric information than can
be described by a topological space. With stronger axioms come stronger theorems
but also fewer examples. However, the axioms of a metric space allow for a vast
and varied array of examples and the theorems one can prove are very strong. In
particular, complete metric spaces, i.e., those that, intuitively, have no holes, admit
two very strong theorems. One is the Banach Fixed-Point Theorem and the second is
Baire’s Theorem. The former can be used to solve, among other things, differential
equations, while the latter has deep consequences for the structure of complete metric
spaces and for continuous functions.


1.1.5 Normed Spaces and Banach Spaces 

A very basic attribute one can associate with a vector is its norm, i.e., its length. The 
abstract formalism is given by the axioms of a normed space. The presence of a norm 
allows one to define the distance between any two vectors, by the so-called induced 
metric, and one obtains a metric space, as well as the topology it induces. Thus any 
normed space immediately incorporates both algebra and geometry. A Banach space 
is a normed space which, as a metric space, is complete. The interaction between 
the algebra and the geometry is then particularly powerful, allowing for very strong 
results. 


1.1.6 Topological Groups

Banach spaces are a fusion of algebra and geometry. Similarly, topological groups are 
a fusion between algebra and topology. Unlike a Banach space, the algebraic structure 
is that of a group while the geometry is reduced from a metric to a topology. Thus, a 
topological group is a much weaker structure than a Banach space, yet the interaction 
between the algebra and the topology still gives rise to a very rich and interesting 
theory. 


1.2 Preliminaries 

The aim of this section is to recount some of the fundamental notions of set theory, 
the common language for rigorous mathematical discussions. It serves to quickly 
orient the reader with respect to the notation used throughout the book as well as 
to present fundamental results on cardinals and demonstrate the technique of Zorn’s 
Lemma. This section is designed to be skimmed through and only consulted for a 
more detailed reading if needed further on. For that reason, and unlike the rest of the 
book, the style of presentation in this section is rather condensed. 


1.2.1 Sets 

The notion of a set is taken to be a primitive notion, left undefined. In an axiomatic 
approach to sets one lists certain well formulated and carefully chosen axioms, while 
in a naive approach one relies on a common understanding of what sets are, avoiding 
the technical difficulties of precise definitions at the cost of some rigor. We adopt the 
naive approach and thus for us a set is a collection of elements, where no repetitions 
are allowed and no ordering of the elements of the set is assumed. 


1.2.2 Common Sets 

Among the sets we will encounter are the sets $\mathbb{N} = \{1, 2, 3, \dots\}$ of natural numbers,
$\mathbb{Z} = \{\dots, -3, -2, -1, 0, 1, 2, 3, \dots\}$ of integers, $\mathbb{Q} = \{p/q \mid p, q \in \mathbb{Z}, q \ne 0\}$
of rational numbers, $\mathbb{R}$ of real numbers, and $\mathbb{C} = \{x + iy \mid x, y \in \mathbb{R}\}$ of complex
numbers. The notation illustrates two commonly occurring ways for introducing sets,
the informal $\dots$ which is meant to indicate that the reader knows what the author
had in mind, and the precise form $\{x \in X \mid P(x)\}$, to be interpreted as the set of all
those elements $x$ in $X$ satisfying property $P$. The empty set is denoted by $\emptyset$, and it
simply has no elements in it. A set $\{x\}$, consisting of just a single element, is called
a singleton set.


1.2.3 Relations Between Sets 

The notation $x \in X$ was already used above to indicate that $x$ is a member of the set
$X$. To indicate that $x$ is not a member of $X$ we write $x \notin X$. Two sets $X$ and $Y$ may
have the property that any element $x \in X$ is also a member of $Y$, a situation denoted
by $X \subseteq Y$, meaning that $X$ is contained in $Y$, $X$ is a subset of $Y$, $Y$ contains $X$, and
that $Y$ is a superset of $X$. If moreover $X \ne Y$, then $X$ is said to be a proper subset
of $Y$, denoted by $X \subset Y$. In particular, $X \subset Y$ means that every element $x \in X$
satisfies $x \in Y$ and that for at least one element $y \in Y$ it holds that $y \notin X$. An often
useful observation is that two sets $X$ and $Y$ are equal, denoted by $X = Y$, if, and
only if, $X \subseteq Y$ and $Y \subseteq X$.
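For finite sets, the relations just described can be checked mechanically. The following Python sketch spells out the definitions directly; the helper names are hypothetical and chosen only for this illustration. Python's built-in set operators `<=`, `<`, and `==` express the same relations.

```python
def is_subset(X, Y):
    """Every element of X is also a member of Y."""
    return all(x in Y for x in X)

def is_proper_subset(X, Y):
    """X is a subset of Y and some element of Y is not in X."""
    return is_subset(X, Y) and any(y not in X for y in Y)

def are_equal(X, Y):
    """Equality via double inclusion."""
    return is_subset(X, Y) and is_subset(Y, X)

X = {1, 2, 3}
Y = {1, 2, 3, 4}

print(is_subset(X, Y), is_proper_subset(X, Y))   # X is a proper subset of Y
print(is_proper_subset(X, X))                    # no set is a proper subset of itself
print(are_equal({1, 2}, {2, 1}))                 # order and repetition play no role
```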


1.2.4 Families of Sets; Union and Intersection 

Any set $I$ may serve as an indexing set for a set of sets. In that case it is customary to
speak of a family of sets, a collection of sets, or an indexed set. For instance, consider
$I = [0, 1] = \{t \in \mathbb{R} \mid 0 \le t \le 1\}$, and for each $t \in I$ let $X_t = (0, t) = \{x \in \mathbb{R} \mid
0 < x < t\}$. Then $\{X_t\}_{t \in I}$ is a family of sets indexed by $I$. For any such indexed
collection, its union is the set

$$\bigcup_{i \in I} X_i$$

which is the set of all those elements $x$ that belong to $X_i$ for some $i \in I$ (that may
depend on $x$). Moreover, if $X_i \cap X_j = \emptyset$ for all $i, j \in I$ with $i \ne j$, then the
collection is said to be pairwise disjoint and the union is then said to be a disjoint
union. The intersection of the collection is the set

$$\bigcap_{i \in I} X_i$$

which is the set of all those elements $x$ that belong to $X_i$ for all $i \in I$. For the indexed
collection given above one may verify that

$$\bigcup_{t \in I} X_t = (0, 1)$$

and that

$$\bigcap_{t \in I} X_t = \emptyset.$$

When the indexing set is finite, one typically writes

$$X_1 \cup X_2 \cup \dots \cup X_n$$

and

$$X_1 \cap X_2 \cap \dots \cap X_n$$

for the union and intersection, respectively.
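The union and intersection of an indexed family can be sketched in Python for finite data. Below, a family is modeled as a dictionary from indices to sets; this modeling, the finite grid of sample points standing in for the real line, and the function names are all assumptions made for illustration. The family is a finite analogue of the intervals $(0, t)$ considered above.

```python
from fractions import Fraction

def union(family):
    """All elements belonging to family[i] for some index i."""
    result = set()
    for member in family.values():
        result |= member
    return result

def intersection(family):
    """All elements belonging to family[i] for every index i.
    Assumes the family is nonempty."""
    members = list(family.values())
    result = set(members[0])
    for member in members[1:]:
        result &= member
    return result

def pairwise_disjoint(family):
    """True if any two members with distinct indices share no element."""
    keys = list(family)
    return all(family[i].isdisjoint(family[j])
               for a, i in enumerate(keys) for j in keys[a + 1:])

# Finite stand-ins: sample points k/8 and indices t = 0, 1/4, 1/2, 3/4, 1.
grid = {Fraction(k, 8) for k in range(9)}
I = [Fraction(k, 4) for k in range(5)]
X = {t: {x for x in grid if 0 < x < t} for t in I}

print(sorted(union(X)))       # the sample points strictly between 0 and 1
print(intersection(X))        # empty, because X[0] is already empty
print(pairwise_disjoint(X))   # the members overlap, so not pairwise disjoint
```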

1.2.5 Set Difference, Complementation, and De Morgan’s Laws 

The set difference $X - Y$ is the set $\{x \in X \mid x \notin Y\}$. In the presence of a universal
set or a set of discourse, that is a set $U$ which is the relevant ambient set in a given
context, the complement of a set $X \subseteq U$ is the set $X^c = \{y \in U \mid y \notin X\}$. In other
words, $X^c = U - X$. Needless to say, the notation $X^c$ pre-supposes knowledge of $U$,
and different choices of $U$ yield different complements. In this context we mention
that the set difference $X - Y$ is also known as the relative complement of $Y$ in $X$.

Among the many properties of sets and the operations of union, intersection, and 
complementation we only mention De Morgan’s laws: given any indexed collection 
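$\{X_i\}_{i \in I}$ of subsets of a set $X$, the equalities

$$\Bigl( \bigcup_{i \in I} X_i \Bigr)^c = \bigcap_{i \in I} X_i^c$$

and

$$\Bigl( \bigcap_{i \in I} X_i \Bigr)^c = \bigcup_{i \in I} X_i^c$$

both hold. Here the complements are relative to the set $X$.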
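De Morgan's laws are easy to test on a small example. The sketch below fixes a finite ambient set and a family of three subsets, all chosen arbitrarily for illustration, and compares both sides of each law. It is a sanity check on one example and not a substitute for the proof.

```python
def complement(A, ambient):
    """Complement of A relative to the ambient set."""
    return ambient - A

ambient = set(range(10))
family = {
    "evens": {0, 2, 4, 6, 8},
    "small": {0, 1, 2, 3},
    "primes": {2, 3, 5, 7},
}

members = list(family.values())
big_union = set().union(*members)
big_intersection = set.intersection(*members)
complements = [complement(A, ambient) for A in members]

# Complement of the union equals the intersection of the complements.
law_1 = complement(big_union, ambient) == set.intersection(*complements)
# Complement of the intersection equals the union of the complements.
law_2 = complement(big_intersection, ambient) == set().union(*complements)

print(law_1, law_2)
```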


1.2.6 Finite Cartesian Products 


Given $m$ sets ($m \ge 1$), $X_1, \dots, X_m$, their cartesian product is the set

$$X_1 \times \dots \times X_m = \{(x_1, \dots, x_m) \mid x_k \in X_k \text{ for all } 1 \le k \le m\}.$$

In other words, the cartesian product of $m$ sets $X_1, \dots, X_m$ (in that order) is the set
consisting of all $m$-tuples where the $k$-th component (for $1 \le k \le m$) is taken from
the $k$-th set. In case $X_k = X$ for all $1 \le k \le m$, the cartesian product is denoted by
$X^m$ and is called the $m$-fold cartesian product of $X$ with itself.
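For finite sets, cartesian products can be listed explicitly. The following sketch uses `itertools.product` from the Python standard library; the sets `X1` and `X2` are arbitrary examples of our own choosing.

```python
from itertools import product

X1 = {0, 1}
X2 = {"a", "b", "c"}

# The product X1 x X2 as a set of ordered pairs (2-tuples).
pairs = set(product(X1, X2))
print(len(pairs))             # |X1| * |X2| elements

# The order of the factors matters: X2 x X1 consists of different tuples.
swapped = set(product(X2, X1))
print(pairs == swapped)

# The m-fold product of a set with itself, here with m = 3.
cube = set(product(X1, repeat=3))
print(len(cube))              # |X1| ** 3 elements
```

Note that the tuples in `pairs` take their first component from `X1` and their second from `X2`, matching the convention that the $k$-th component comes from the $k$-th set.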


1.2.7 Functions 

Given any two sets $X$ and $Y$, a function or a mapping, denoted by $f : X \to Y$ or
$X \xrightarrow{f} Y$, is, informally, a rule that associates with each element $x \in X$ an element
$f(x) \in Y$. More formally, a function $f : X \to Y$ is a subset $f \subseteq X \times Y$ satisfying
the property that for all $x \in X$ and $y_1, y_2 \in Y$, if $(x, y_1) \in f$ and $(x, y_2) \in f$,
then $y_1 = y_2$, and that for all $x \in X$ there exists some $y \in Y$ with $(x, y) \in f$. The
set $X$ is the domain of the function while $Y$ is the codomain. For any set $X$ there
is always the identity function $\mathrm{id}_X : X \to X$ given by $\mathrm{id}_X(x) = x$. Further, given
functions $f : X \to Y$ and $g : Y \to Z$, their composition is defined to be the function
$g \circ f : X \to Z$ given by $(g \circ f)(x) = g(f(x))$. It is obvious that composition of
functions is associative, that is, if

$$A_1 \xrightarrow{f} A_2 \xrightarrow{g} A_3 \xrightarrow{h} A_4,$$

then

$$h \circ (g \circ f) = (h \circ g) \circ f.$$

A function $f : X \to Y$ is injective or one-to-one if $f(x_1) = f(x_2)$ implies $x_1 = x_2$,
for all $x_1, x_2 \in X$. A function $f : X \to Y$ is surjective or onto if for all $y \in Y$ there
exists an $x \in X$ such that $f(x) = y$. A function $f : X \to Y$ is called bijective or
invertible if it is both injective and surjective. Equivalently, a function $f : X \to Y$
is invertible if there exists a function $f^{-1} : Y \to X$, called the inverse of $f$, such
that $f \circ f^{-1} = \mathrm{id}_Y$ and $f^{-1} \circ f = \mathrm{id}_X$. Such an inverse, if it exists, is easily seen
to be unique.

Given a function $f : X \to Y$ and a subset $S \subseteq X$, the restriction of $f$ to $S$ is the
function $g : S \to Y$ given by $g(s) = f(s)$. This restriction function is denoted by
$f|_S$. We also mention that any two sets $X$ and $Y$ with $X \subseteq Y$ give rise to the
inclusion function $\iota : X \to Y$, defined by $\iota(x) = x$. Note that the inclusion function
differs from the identity function only by the codomain, not by the functional assignments.
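
For finite sets the definitions of this section can be checked mechanically. The following Python sketch is an illustration only and not part of the formal development: it models a function as a dictionary, and the helper names (`is_function`, `is_injective`, `is_surjective`, `compose`, `identity`) are hypothetical choices made for this example.

```python
def is_function(pairs, X, Y):
    """True if `pairs` (a set of (x, y) tuples) is a function X -> Y."""
    firsts = [x for (x, _) in pairs]
    in_product = all(x in X and y in Y for (x, y) in pairs)
    # Every x in X must occur as a first component exactly once.
    single_valued = len(firsts) == len(set(firsts))
    return in_product and single_valued and set(firsts) == set(X)

def is_injective(f):
    # Distinct arguments must have distinct values.
    return len(set(f.values())) == len(f)

def is_surjective(f, Y):
    # Every element of the codomain must be attained.
    return set(f.values()) == set(Y)

def compose(g, f):
    # (g o f)(x) = g(f(x))
    return {x: g[f[x]] for x in f}

def identity(X):
    return {x: x for x in X}

X, Y = {1, 2, 3}, {"a", "b", "c"}
f = {1: "a", 2: "b", 3: "c"}            # a bijection X -> Y
f_inv = {y: x for x, y in f.items()}    # its inverse Y -> X
c = {1: "a", 2: "a", 3: "a"}            # a constant function X -> Y
```

Here `compose(f_inv, f)` agrees with `identity(X)` and `compose(f, f_inv)` with `identity(Y)`, while the constant function `c` is neither injective nor surjective.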


1.2.8 Arbitrary Cartesian Products

The cartesian product of any indexed family $\{X_i\}_{i \in I}$ of sets is the set

$$\Big\{ f : I \to \bigcup_{i \in I} X_i \ \Big|\ f(i) \in X_i, \ \forall i \in I \Big\}.$$

To justify the definition, observe that when $I = \{1, 2, 3, \dots, m\}$, each function $f$ as
above can be uniquely identified with the $m$-tuple $(f(1), f(2), \dots, f(m))$, in other
words, with an element of $X_1 \times \cdots \times X_m$, so this definition recovers the
definition of the cartesian product of finitely many sets, as given above. When $X_i = X$ for
all $i \in I$, the cartesian product is denoted by $X^I$. Notice that $X^I$ is simply the set of
all functions $f : I \to X$.
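
The identification of functions on a finite index set with tuples can be made concrete. The sketch below is illustrative only; the family `family` and the helper `as_tuple` are assumptions introduced for this example. Each element of the product is represented as a dictionary $i \mapsto f(i)$.

```python
from itertools import product

# An indexed family with index set I = {1, 2}: X_1 = {a, b}, X_2 = {0, 1, 2}.
family = {1: ["a", "b"], 2: [0, 1, 2]}
index_set = sorted(family)

# Every element of the product is a function f on I with f(i) in X_i.
choice_functions = [
    dict(zip(index_set, values))
    for values in product(*(family[i] for i in index_set))
]

def as_tuple(f):
    """Identify f with the tuple (f(1), ..., f(m))."""
    return tuple(f[i] for i in index_set)

tuples = {as_tuple(f) for f in choice_functions}
```

The set `tuples` coincides with the usual product $X_1 \times X_2$ of ordered pairs, which has $2 \cdot 3 = 6$ elements.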


1.2.9 Direct and Inverse Images 

The power set of a set $X$ is the set of all subsets of $X$, and is denoted by $\mathcal{P}(X)$.
Every function $f : X \to Y$ induces the direct image function $\mathcal{P}(X) \to \mathcal{P}(Y)$,
denoted again by $f$, given by $f(A) = \{ f(a) \mid a \in A \}$, for all $A \subseteq X$ (notice
that this definition introduces a new way to construct sets). Similarly, any function
$f : X \to Y$ induces the inverse image function $f^{-1} : \mathcal{P}(Y) \to \mathcal{P}(X)$, which,
for every $B \subseteq Y$, is given by $f^{-1}(B) = \{ x \in X \mid f(x) \in B \}$. Notice that
$f^{-1}(Y) = X$ but that the inclusion $f(X) \subseteq Y$ may be proper. The set $f(X)$ is called
the image of $f$. Note that $f(X) = Y$ if, and only if, $f$ is surjective. We remark that the
notation $f^{-1}$ for the inverse image conflicts with the notation for the inverse function
of $f$ (when it exists) and thus some care should be exercised.

The following properties of functions are easily verified. If
$X \xrightarrow{f} Y \xrightarrow{g} Z$ are functions and $S \subseteq Z$ is a subset, then

$$(g \circ f)^{-1}(S) = f^{-1}(g^{-1}(S)).$$

Further, for all subsets $S_1, S_2 \subseteq Y$,

$$f^{-1}(S_1 \cap S_2) = f^{-1}(S_1) \cap f^{-1}(S_2)$$

and

$$f^{-1}(S_1 \cup S_2) = f^{-1}(S_1) \cup f^{-1}(S_2).$$

More generally, given any collection $\{S_i\}_{i \in I}$ of subsets of $Y$,

$$f^{-1}\Big(\bigcap_{i \in I} S_i\Big) = \bigcap_{i \in I} f^{-1}(S_i)$$

and

$$f^{-1}\Big(\bigcup_{i \in I} S_i\Big) = \bigcup_{i \in I} f^{-1}(S_i).$$
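
The identities above can be spot-checked on small finite sets. The following Python sketch is only a sanity check on one example, not a proof; the helpers `image` and `preimage` and the particular functions `f` and `g` are assumptions made for illustration.

```python
def image(f, A):
    """Direct image f(A) = { f(a) | a in A }."""
    return {f[a] for a in A}

def preimage(f, B):
    """Inverse image f^{-1}(B) = { x | f(x) in B }."""
    return {x for x in f if f[x] in B}

X = set(range(6))
Y = {0, 1, 2, 3}
f = {x: x % 3 for x in X}       # f : X -> Y, not surjective (3 is never hit)
g = {y: y * y for y in Y}       # g : Y -> Z with Z = {0, 1, 4, 9}
g_after_f = {x: g[f[x]] for x in X}

S1, S2 = {0, 1}, {1, 3}         # subsets of Y
S = {0, 4}                      # a subset of Z
```

For these choices, `preimage(f, S1 & S2)` equals `preimage(f, S1) & preimage(f, S2)`, and likewise for unions, while `image(f, X)` is the proper subset `{0, 1, 2}` of `Y`.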


1.2.10 Indicator Functions 


Let $\mathbb{S}$ be the set $\{0, 1\}$ and fix some set $X$. The subsets of $X$ are easily seen to
be in bijective correspondence with the functions $X \to \mathbb{S}$. The correspondence is given
as follows. Given a subset $S \subseteq X$ let $F(S) : X \to \mathbb{S}$ be given by

$$F(S)(x) = \begin{cases} 1 & \text{if } x \in S \\ 0 & \text{if } x \notin S. \end{cases}$$

The function $F(S)$ is called the indicator function of the subset $S \subseteq X$. Conversely,
given any function $f : X \to \mathbb{S}$, the subset of $X$ it determines is
$G(f) = f^{-1}(\{1\})$. It is easily verified that $F$ and $G$ are each other's inverses, setting
up the stated bijective correspondence. Note that $f^{-1}(\{0\})$ is the complement
$X - f^{-1}(\{1\})$.
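
For a small finite set the correspondence between subsets and indicator functions can be verified exhaustively. In the sketch below, which is illustrative only, `F` and `G` mirror the maps of the text, with the ambient set passed to `F` explicitly as an extra argument (an assumption of this encoding).

```python
from itertools import combinations

def F(S, X):
    """Indicator function of S as a dictionary X -> {0, 1}."""
    return {x: 1 if x in S else 0 for x in X}

def G(f):
    """The subset f^{-1}({1}) determined by f : X -> {0, 1}."""
    return {x for x, value in f.items() if value == 1}

X = {"a", "b", "c"}
subsets = [set(c) for r in range(len(X) + 1)
           for c in combinations(sorted(X), r)]

# G undoes F on every subset of X.
round_trip_ok = all(G(F(S, X)) == S for S in subsets)
```

Since `X` has three elements, `subsets` has $2^3 = 8$ members, matching the number of functions $X \to \{0, 1\}$.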


1.2.11 Cardinality 

Two sets $X$ and $Y$ are said to have the same cardinality if there exists a bijection
$f : X \to Y$, in which case we write $|X| = |Y|$. If there is an injection $f : X \to Y$,
then $X$ is said to have cardinality less than or equal to that of $Y$, and we write

$$|X| \le |Y|.$$

Two finite sets have the same cardinality if, and only if, they have the same number
of elements. It is thus common to write $|X|$ for a finite set $X$ to indicate the number
of elements in $X$.

A set $X$ is said to be countable if it is empty or if there exists a surjective function
$f : \mathbb{N} \to X$. Equivalently, a set is countable if it has the same cardinality as some
subset of $\mathbb{N}$. Equivalently still, a set $X$ is countable precisely when there exists an
injective function $X \to \mathbb{N}$. A set that is both countable and infinite is said to be
countably infinite. A set that is not countable is called uncountable. A set which has
the same cardinality as $\mathbb{R}$ is said to have the cardinality of the continuum. Examples
of countable sets include every finite set, the set $\mathbb{N}$, as well as the set $\mathbb{Z}$ of
integers and the set $\mathbb{Q}$ of rational numbers. Sets of the cardinality of the continuum
include the set $\mathbb{R}$ of real numbers as well as the set $\mathbb{C}$ of complex numbers, the
$n$-fold products $\mathbb{R}^n$ and $\mathbb{C}^n$ for any $n \in \mathbb{N}$, as well as the power set
$\mathcal{P}(\mathbb{N})$, namely the set of all subsets of natural numbers, and the set of all
continuous functions $f : \mathbb{R} \to \mathbb{R}$. Uncountable sets of cardinality larger than that
of the continuum include the power set $\mathcal{P}(\mathbb{R})$ of all subsets of real numbers, as
well as the set of all functions $f : \mathbb{R} \to \mathbb{R}$.


1.2.12 The Cantor-Schröder-Bernstein Theorem

The result we present now is a very convenient tool in establishing that two sets have 
the same cardinality. 

Theorem 1.1 (Cantor-Schröder-Bernstein) For all sets $X$ and $Y$, if $|X| \le |Y|$ and
$|Y| \le |X|$, then $|X| = |Y|$.

Proof By the condition in the assertion, there exists an injective function $f : X \to Y$
and an injective function $g : Y \to X$. To construct a bijection $h : X \to Y$, we
consider the behaviour of elements in both $X$ and $Y$ with respect to the given functions
$f$ and $g$. Notice that since $g$ is injective, the inverse image $g^{-1}(x_0)$, for any
$x_0 \in X$, is either empty or is of the form $\{y_0\}$, where $g(y_0) = x_0$. Similarly,
$f^{-1}(y_1)$, for any $y_1 \in Y$, is either empty or is of the form $\{x_1\}$ with
$f(x_1) = y_1$. We thus write $f^{-1}(y_1) = x_1$ and $g^{-1}(x_0) = y_0$, when these sets are
not empty. Starting with $x_0 \in X$ we may successively consider the sequence

$$x_0 \mapsto g^{-1}(x_0) = y_0 \mapsto f^{-1}(y_0) = x_1 \mapsto g^{-1}(x_1) = y_1 \mapsto \cdots$$

where at each step we apply $f^{-1}$ or $g^{-1}$ alternately, if they exist. Such a sequence
exhibits precisely one of three possible behaviours. It may be infinite, in which case
$x_0$ is said to have type $\infty$, or it may terminate since $g^{-1}(x_k)$ was empty, in which
case $x_0$ is said to have type $X$, or the sequence may stop due to the fact that
$f^{-1}(y_k)$ was empty, in which case $x_0$ is said to have type $Y$. In exactly the same way,
each element of $Y$ can be classified as having precisely one type, either type $\infty$, or
type $X$ or type $Y$.

Let us now denote by $X_\infty$ all type $\infty$ elements in $X$, by $X_X$ all type $X$ elements
in $X$, and by $X_Y$ all type $Y$ elements in $X$. Similarly, introduce the notation
$Y_\infty$, $Y_X$, and $Y_Y$. Thus

$$X = X_\infty \cup X_X \cup X_Y$$

and the union is pairwise disjoint. Similarly we obtain the pairwise disjoint union

$$Y = Y_\infty \cup Y_X \cup Y_Y.$$

The function given by

$$h(x) = \begin{cases} f(x) & \text{if } x \in X_X \\ g^{-1}(x) & \text{if } x \in X_Y \\ f(x) & \text{if } x \in X_\infty \end{cases}$$

is now easily seen to be a bijection, for instance by constructing an inverse function
for it. □
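
The construction in the proof can be traced on a concrete example. The sketch below is an illustration under stated assumptions, not a general implementation: it takes $X = Y = \{0, 1, 2, \dots\}$ with the injections $f(x) = x + 1$ and $g(y) = y + 1$ (both chosen for this example), for which every backward chain terminates, so type $\infty$ never occurs.

```python
def f(x):
    """An injection X -> Y (assumed for illustration), X = Y = {0, 1, 2, ...}."""
    return x + 1

def g(y):
    """An injection Y -> X (assumed for illustration)."""
    return y + 1

def f_inv(y):
    """The unique x with f(x) = y, or None if f^{-1}(y) is empty."""
    return y - 1 if y >= 1 else None

def g_inv(x):
    """The unique y with g(y) = x, or None if g^{-1}(x) is empty."""
    return x - 1 if x >= 1 else None

def type_of(x0):
    """Follow x0 -> g^{-1}(x0) -> f^{-1}(...) -> ... and report where it stops."""
    current, in_X = x0, True
    while True:
        previous = g_inv(current) if in_X else f_inv(current)
        if previous is None:
            return "X" if in_X else "Y"
        current, in_X = previous, not in_X

def h(x):
    """The bijection of the proof: g^{-1} on type-Y elements, f elsewhere."""
    return g_inv(x) if type_of(x) == "Y" else f(x)
```

With these choices even numbers have type $X$ and odd numbers have type $Y$, so `h` swaps $2k$ and $2k + 1$, which is visibly a bijection.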


1.2.13 Countable Arithmetic 

Theorem 1.2 A subset of a countable set is countable. 

Proof Suppose a set $X$ is countable and $S \subseteq X$ is a subset. If $S = \emptyset$, then it is
countable, so we may assume $S \ne \emptyset$ and let us fix some $s_0 \in S$. Since $X$ is
countable, there exists a surjection $f : \mathbb{N} \to X$. Clearly, there is a surjection
$g : X \to S$ (for instance, $g(x) = x$ if $x \in S$, and $g(x) = s_0$ if $x \notin S$) and, since
the composition of surjective functions is surjective, the function
$g \circ f : \mathbb{N} \to S$ is surjective, showing that $S$ is countable. □

Theorem 1.3 The cartesian product of finitely many countable sets is countable.

Proof Let $X_1, \dots, X_m$ be countable sets and choose, for each $1 \le k \le m$, an
injective function $f_k : X_k \to \mathbb{N}$. These functions together give rise to

$$f : X_1 \times \cdots \times X_m \to \mathbb{N}^m$$

given by

$$f(x_1, \dots, x_m) = (f_1(x_1), \dots, f_m(x_m)),$$

which is clearly injective. Now fix $m$ distinct prime numbers $p_1, \dots, p_m$ and let

$$g : \mathbb{N}^m \to \mathbb{N}$$

be given by

$$g(n_1, \dots, n_m) = p_1^{n_1} \cdots p_m^{n_m}.$$

The fact that every natural number has an essentially unique decomposition into
prime factors implies that $g$ is injective. To conclude then, the composition

$$X_1 \times \cdots \times X_m \xrightarrow{f} \mathbb{N}^m \xrightarrow{g} \mathbb{N}$$

is injective, and thus $X_1 \times \cdots \times X_m$ is countable.


□ 
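
The prime-power encoding used in the proof of Theorem 1.3 is easy to experiment with. The sketch below is a finite sanity check only (injectivity on all of $\mathbb{N}^m$ rests on unique factorization, as stated in the proof); the choice $m = 3$ with primes 2, 3, 5 and the test range are assumptions made for illustration.

```python
from itertools import product

PRIMES = (2, 3, 5)   # m = 3 distinct primes p_1, p_2, p_3

def g(ns):
    """g(n_1, ..., n_m) = p_1^{n_1} * ... * p_m^{n_m}."""
    result = 1
    for p, n in zip(PRIMES, ns):
        result *= p ** n
    return result

# All triples with entries in {0, ..., 5}: 6^3 = 216 of them.
triples = list(product(range(6), repeat=3))
codes = {g(t) for t in triples}
```

No two of the 216 triples receive the same code, so `len(codes)` equals `len(triples)`; for instance `g((1, 2, 0))` is $2 \cdot 3^2 = 18$.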

Theorem 1.4 A countable union of countable sets is itself countable.

Proof Let $\{X_m\}_{m \in \mathbb{N}}$ be a countable family of countable sets. We may assume,
without loss of generality, that the collection is pairwise disjoint and that
$X_m \ne \emptyset$ for all $m \in \mathbb{N}$. Since each $X_m$ is countable, there exists a
surjective function $f_m : \mathbb{N} \to X_m$. Let

$$X = \bigcup_{m \in \mathbb{N}} X_m$$

and define now the function

$$g : \mathbb{N} \times \mathbb{N} \to X$$

by

$$g(m, n) = f_m(n),$$

which is clearly surjective. To finish the proof, note that $\mathbb{N} \times \mathbb{N}$, being the
product of two countable sets, is countable, and thus there is a surjection
$h : \mathbb{N} \to \mathbb{N} \times \mathbb{N}$. The composition

$$\mathbb{N} \xrightarrow{h} \mathbb{N} \times \mathbb{N} \xrightarrow{g} X$$

is thus a surjection, proving that $X$ is countable. □
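
The composition $g \circ h$ can be written out explicitly in a simple case. The sketch below is an illustration under assumptions of its own: $\mathbb{N}$ is taken to start at 0 for programming convenience, $h$ is the inverse of the Cantor pairing (one of many possible surjections $\mathbb{N} \to \mathbb{N} \times \mathbb{N}$, not necessarily the one the proof has in mind), and $X_m$ is the set of non-negative fractions with denominator $m + 1$, so that the sets are not pairwise disjoint (disjointness is not needed for surjectivity).

```python
from fractions import Fraction
from math import isqrt

def h(k):
    """Inverse Cantor pairing: enumerates N x N along anti-diagonals."""
    w = (isqrt(8 * k + 1) - 1) // 2   # index of the diagonal containing k
    t = w * (w + 1) // 2              # first position on that diagonal
    n = k - t
    m = w - n
    return (m, n)

def g(m, n):
    """g(m, n) = f_m(n), where f_m(n) = n / (m + 1) surjects N onto X_m."""
    return Fraction(n, m + 1)

def enumerate_union(k):
    """The composite g o h : N -> X."""
    return g(*h(k))
```

The first 15 values of `h` are exactly the pairs $(m, n)$ with $m + n \le 4$, so every element $f_m(n)$ with small indices, such as $3/2 = f_1(3)$, is reached early in the enumeration.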


1.2.14 Relations 

A relation $R$ on a set $X$ is a subset $R \subseteq X \times X$, but we write $xRy$ instead of
$(x, y) \in R$. The relation $R$ is said to be reflexive if $xRx$ holds for all $x \in X$. The
relation $R$ is symmetric if $xRy$ implies $yRx$, for all $x, y \in X$. The relation $R$ is
anti-symmetric if, for all $x, y \in X$, the assertions $xRy$ and $yRx$ together imply that
$x = y$. Finally, the relation $R$ is said to be transitive if $xRy$ and $yRz$ together imply
$xRz$, for all $x, y, z \in X$.


1.2.15 Equivalence Relations 

A relation $R$ on a set $X$ is called an equivalence relation if it is reflexive, symmetric,
and transitive, in which case we denote $R$ by $\sim$, and thus write $x \sim y$ instead of
$xRy$. If $x \sim y$ we say that $x$ and $y$ are equivalent. A partition of $X$ is a collection
$\{X_i\}_{i \in I}$ of non-empty pairwise disjoint subsets of $X$ such that
$\bigcup_{i \in I} X_i = X$.
Given a partition $\{X_i\}_{i \in I}$ of $X$, defining $x \sim y$ precisely when there exists an
$i \in I$ with $x, y \in X_i$, is easily seen to define an equivalence relation. If we denote by
$\mathrm{Par}(X)$ the set of all partitions of $X$ and by $\mathrm{Equ}(X)$ the set of all equivalence
relations on $X$, then the process just described defines a function
$E : \mathrm{Par}(X) \to \mathrm{Equ}(X)$.
Conversely, given an equivalence relation $\sim$ on $X$ we may define, for each $x \in X$,
the set

$$[x] = \{ y \in X \mid x \sim y \}$$

of all those elements $y \in X$ which are equivalent to $x$. This set is known as the
equivalence class of $x$, and $x$ is called a representative of the equivalence class $[x]$
(if it is important to emphasize the relation then we write $[x]_\sim$). It can easily be
verified that two elements $x, y \in X$ represent the same equivalence class, that is
$[x] = [y]$, if, and only if, $x \sim y$. It is completely straightforward to demonstrate
that the set $\{[x] \mid x \in X\}$ of all equivalence classes is a partition of $X$, and we thus
obtain a function $P : \mathrm{Equ}(X) \to \mathrm{Par}(X)$. It is quite easy to verify that in fact
$P$ is the inverse function of $E$ and thus we have established a bijective correspondence
between equivalence relations on $X$ and partitions of $X$.

Given an equivalence relation $\sim$ on a set $X$, the set $\{[x] \mid x \in X\}$ is denoted by
$X/{\sim}$ and is called the quotient set of $X$ modulo $\sim$. There is also the corresponding
function $\pi : X \to X/{\sim}$, given by $\pi(x) = [x]$, called the canonical projection.
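
The passage between partitions and equivalence relations can be made concrete for a finite set. In the sketch below, offered as an illustration only, `E` and `P` mirror the functions of the text (with the ambient set passed to `P` explicitly, an assumption of this encoding), relations are stored as sets of pairs, and the particular partition of $\{0, \dots, 5\}$ into residue classes modulo 3 is chosen for the example.

```python
def E(partition):
    """Equivalence relation induced by a partition: x ~ y iff same block."""
    return {(x, y) for block in partition for x in block for y in block}

def P(relation, X):
    """Partition of X into the equivalence classes [x] = {y | x ~ y}."""
    return {frozenset(y for y in X if (x, y) in relation) for x in X}

def projection(relation, X):
    """Canonical projection x -> [x], as a dictionary."""
    return {x: frozenset(y for y in X if (x, y) in relation) for x in X}

X = set(range(6))
partition = {frozenset({0, 3}), frozenset({1, 4}), frozenset({2, 5})}
R = E(partition)
pi = projection(R, X)
```

Applying `P` to `E(partition)` returns the original partition, and applying `E` to `P(R, X)` returns `R`, in line with the claim that the two constructions are mutually inverse.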


1.2.16 Ordered Sets 

A poset (short for partially ordered set) is a pair $(X, \le)$ where $X$ is a set and $\le$ is a
relation on $X$ which is reflexive, transitive, and anti-symmetric.

Example 1.1 Given any set $X$, let $\mathcal{P}(X)$ be the set of all subsets of $X$. Defining, for
subsets $Y_1, Y_2 \subseteq X$, that $Y_1 \le Y_2$ precisely when $Y_1 \subseteq Y_2$, is easily seen
to endow $\mathcal{P}(X)$ with the structure of a poset. In this case we say that $\mathcal{P}(X)$ is
ordered by set inclusion. More generally, if $A$ is some collection of subsets of $X$, then it
too can be endowed with the ordering induced by set inclusion, in precisely the same manner.
A related construction is to consider the set $P$ of all pairs $(Y, f)$ where $Y \subseteq X$ and
$f : Y \to Z$ is a function to some fixed set $Z$. Defining $(Y_1, f_1) \le (Y_2, f_2)$ when
$Y_1 \subseteq Y_2$ and when $f_2$ extends $f_1$ (which means that $f_2(y) = f_1(y)$ holds for all
$y \in Y_1$) is again easily seen to endow $P$ with a poset structure. Again, one may
consider variants of this construction, for instance by demanding extra conditions on
the sets $Y$ or on the functions $f$.

In the context of a general poset, the meaning of $x < y$ is taken to be: $x \le y$
and $x \ne y$. The relations $x \ge y$ and $x > y$ are similarly derived from the given
relation $\le$. Since not all elements $x$ and $y$ in a poset need be comparable ($x$ and
$y$ are comparable when either $x \le y$ or $y \le x$), the correct interpretation of, for
instance, the negation of $x \le y$ is not that $x > y$ but rather that either $x$ and $y$ are
incomparable, or $x > y$.


1.2.17 Zorn’s Lemma

A poset $P$ is said to be a total order or linearly ordered if for all $x, y \in P$ either
$x \le y$ or $y \le x$ holds. The poset $\mathcal{P}(X)$ discussed in Example 1.1 is linearly ordered
if, and only if, $|X| \le 1$.

Any subset $S \subseteq P$ in a poset $(P, \le)$ inherits a poset structure from $P$. We also
say that $P$ induces a poset structure on $S$. In more detail, the poset $S$ is given by
the ordering $\le_S$ defined, for all $x, y \in S$, by $x \le_S y$ precisely when $x \le y$ in $P$.
We usually do not make any notational distinction between $\le$ and the induced order
$\le_S$, and thus simply write $x \le y$ when referring to the ordering in $S$.

Definition 1.1 A chain in a poset $P$ is a subset $S \subseteq P$ which, with the induced
ordering, is linearly ordered. An upper bound of a set $S \subseteq P$ (be it a chain or not)
is an element $y \in P$ such that $x \le y$ holds for all $x \in S$. A maximal element in a
poset $P$ is an element $y_M \in P$ such that $y_M < x$ does not hold for any $x \in P$.

Lemma 1.1 (Zorn’s Lemma) Let P be a non-empty poset. If every chain in P has 
an upper bound, then P has a maximal element. 

Zorn’s Lemma is a powerful result that is more of a proof technique than a lemma, in 
much the same way as proof by induction is a proof technique rather than a lemma. 
A proof of Zorn’s Lemma is subtle and in fact it is well-known that Zorn’s Lemma 
is equivalent to the Axiom of Choice. 
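
The notions of chain, upper bound, and maximal element can be explored in a small finite poset, where no appeal to Zorn's Lemma is needed. The sketch below is an illustration only; the poset chosen (the proper subsets of $\{1, 2, 3\}$ ordered by inclusion) and the helper names are assumptions of this example.

```python
from itertools import combinations

X = {1, 2, 3}
# The poset: all proper subsets of X, ordered by set inclusion.
poset = [frozenset(c) for r in range(len(X))
         for c in combinations(sorted(X), r)]

def is_chain(S):
    """True if any two members of S are comparable under inclusion."""
    return all(a <= b or b <= a for a in S for b in S)

def upper_bounds(S):
    """Elements y of the poset with x <= y for all x in S."""
    return [y for y in poset if all(x <= y for x in S)]

def maximal_elements():
    """Elements y_M of the poset such that y_M < x holds for no x."""
    return [y for y in poset if not any(y < x for x in poset)]

chain = [frozenset(), frozenset({1}), frozenset({1, 2})]
antichain = [frozenset({1}), frozenset({2})]
```

This poset has three maximal elements, namely the two-element subsets, which shows that a maximal element need not be unique and need not be an upper bound of the whole poset.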


1.2.18 A Typical Application of Zorn’s Lemma 

Given any two sets X and Y , it is natural to wonder if their cardinalities compare. 
That is, is it always the case that either $|X| \le |Y|$, or $|Y| \le |X|$? A positive answer
to this question turns out to be equivalent to the Axiom of Choice. We will only
prove half of this equivalence, primarily to illustrate a typical application of Zorn’s
Lemma.

Theorem 1.5 For all sets $X$ and $Y$, either $|X| \le |Y|$ or $|Y| \le |X|$.

Proof The case where either X or Y is empty is easily dispensed with, and so we 
proceed under the assumption that both are non-empty. As a first step we construct 
a suitable poset, similar to the one given in Example 1.1. 

Let $P$ be the set of triples $(X', f, Y')$ where $X' \subseteq X$, $Y' \subseteq Y$, and
$f : X' \to Y'$ is a bijective function. The partial order structure on $P$ is given by

$$(X', f, Y') \le (X'', g, Y'')$$

precisely when

$$X' \subseteq X'', \qquad Y' \subseteq Y'', \qquad g(x) = f(x) \ \text{for all } x \in X'.$$

It is straightforward to verify that this is indeed a partial order.

Next, we show that $(P, \le)$ satisfies the conditions of Zorn’s Lemma. Firstly,
$P \ne \emptyset$ since the triple $(\emptyset, \emptyset, \emptyset)$ is always in $P$. Next, suppose that
$\{(X_i, f_i, Y_i)\}_{i \in I}$ is a chain in $P$, and we shall construct an upper bound for it. Let

$$X_0 = \bigcup_{i \in I} X_i, \qquad Y_0 = \bigcup_{i \in I} Y_i$$

and $f : X_0 \to Y_0$ given by

$$f(x) = f_{i_x}(x).$$

Let us explain the definition of the function $f$. Given any $x \in X_0$, there is an
$i_x \in I$ such that $x \in X_{i_x}$. We may thus consider the value
$f_{i_x}(x) \in Y_{i_x} \subseteq Y_0$. However, there may be another index $i'_x \in I$ with
$x \in X_{i'_x}$, and a priori,

$$f_{i'_x}(x) \ne f_{i_x}(x)$$

is a possibility. However, since $\{(X_i, f_i, Y_i)\}_{i \in I}$ is a chain, we may assume, without
loss of generality, that $(X_{i_x}, f_{i_x}, Y_{i_x}) \le (X_{i'_x}, f_{i'_x}, Y_{i'_x})$. But then, the
definition of $\le$ in $P$ implies that

$$f_{i'_x}(x) = f_{i_x}(x)$$

and so the function $f : X_0 \to Y_0$ above is well-defined. A similar argument shows
that $f$ is bijective.

Now that we have verified the conditions of Zorn’s Lemma we are guaranteed
the existence of a maximal element $(X_M, f_M, Y_M) \in P$. That is, $f_M : X_M \to Y_M$
is a bijective function for some $X_M \subseteq X$ and $Y_M \subseteq Y$. Let us now entertain the
possibility that both $X_M$ and $Y_M$ are proper subsets. That is, that there exist elements
$x_1 \in X - X_M$ and $y_1 \in Y - Y_M$. But then we may define

$$X_1 = X_M \cup \{x_1\}, \qquad Y_1 = Y_M \cup \{y_1\}$$

and $f : X_1 \to Y_1$ by

$$f(x) = \begin{cases} f_M(x) & \text{if } x \in X_M \\ y_1 & \text{if } x = x_1 \end{cases}$$

giving rise to the element $(X_1, f, Y_1) \in P$ (the reader is invited to verify membership
in $P$) with

$$(X_M, f_M, Y_M) < (X_1, f, Y_1),$$

contradicting the maximality of $(X_M, f_M, Y_M)$.

We thus conclude that either $X_M = X$ or $Y_M = Y$. If $X_M = X$, then the
composition $X \xrightarrow{f_M} Y_M \xrightarrow{\iota} Y$ with the inclusion function yields an
injection $X \to Y$, thus showing that $|X| \le |Y|$. If $Y_M = Y$, then the composition
$Y \xrightarrow{f_M^{-1}} X_M \xrightarrow{\iota} X$ with the inclusion function is an injection
$Y \to X$, showing that $|Y| \le |X|$, and completing the proof. □
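
For finite sets the extension step of the proof can be carried out directly, without Zorn's Lemma. The sketch below is a finite analogue offered for intuition only, not the argument of the proof: the hypothetical helper `compare` repeatedly enlarges a partial bijection by one pair, exactly as in the passage from $(X_M, f_M, Y_M)$ to $(X_1, f, Y_1)$, until one of the two sets is exhausted.

```python
def compare(X, Y):
    """Return (f, covers_X): f is a bijection between a subset of X and a
    subset of Y that cannot be extended, and covers_X tells whether the
    domain of f is all of X."""
    f = {}
    rest_X, rest_Y = list(X), list(Y)
    # While both X - X_M and Y - Y_M are non-empty, extend f by one pair.
    while rest_X and rest_Y:
        f[rest_X.pop()] = rest_Y.pop()
    return f, not rest_X

f, covers_X = compare({1, 2, 3}, {"a", "b"})
f_inv = {y: x for x, y in f.items()}   # injection Y -> X when Y is exhausted
```

In this run $Y$ is exhausted first, so `f_inv` is an injection $Y \to X$ witnessing $|Y| \le |X|$; had `covers_X` been true, `f` itself would be an injection $X \to Y$.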


1.2.19 The Real Numbers 

The set R of real numbers may be constructed in numerous different ways and can 
also be characterized axiomatically in different ways. We will not concern ourselves 
here with the construction of a model of the reals, and thus accept their existence 
and only state the axioms that govern them, namely that the reals form a Dedekind 
complete totally ordered field. 

The statement that R, with addition and multiplication, is a field is the claim that 
the following axioms hold: 

• Associativity of addition: for all $a, b, c \in \mathbb{R}$, $(a + b) + c = a + (b + c)$.

• Commutativity of addition: for all $a, b \in \mathbb{R}$, $a + b = b + a$.

• Neutrality of 0: for all $a \in \mathbb{R}$, $a + 0 = a = 0 + a$.

• Existence of additive inverses: for all $a \in \mathbb{R}$ there exists an element $b \in \mathbb{R}$ with
$a + b = 0 = b + a$.

• Associativity of multiplication: for all $a, b, c \in \mathbb{R}$, $(a \cdot b) \cdot c = a \cdot (b \cdot c)$.

• Commutativity of multiplication: for all $a, b \in \mathbb{R}$, $a \cdot b = b \cdot a$.

• Neutrality of 1: for all $a \in \mathbb{R}$, $1 \cdot a = a = a \cdot 1$.

• Existence of reciprocals: for all $a \in \mathbb{R}$, if $a \ne 0$, then there exists an element
$b \in \mathbb{R}$ with $a \cdot b = 1 = b \cdot a$.

• Distributivity: for all $a, b, c \in \mathbb{R}$, $a \cdot (b + c) = a \cdot b + a \cdot c$.

In general, any set $K$ with addition and multiplication operations, and distinguished
elements $0 \ne 1$ in $K$, that satisfy these axioms is said to be a field. For instance, both
$\mathbb{C}$ and $\mathbb{Q}$, with the usual notion of addition and multiplication, are fields. In
contrast, $\mathbb{Z}$ with the same familiar addition and multiplication is not a field.
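
The contrast between $\mathbb{Q}$ and $\mathbb{Z}$ lies in the existence of reciprocals, which can be illustrated with exact rational arithmetic. The sketch below uses Python's standard `fractions` module; it is a spot check on sample values, not a verification of the axioms, and the search range for integers is an arbitrary assumption.

```python
from fractions import Fraction

a = Fraction(3, 4)
b = 1 / a                       # the reciprocal of a, again a rational number
c = Fraction(-2, 5)

reciprocal_ok = (a * b == 1) and (b * a == 1)
distributive_ok = a * (b + c) == a * b + a * c

# In Z the element 2 has no reciprocal: no integer n satisfies 2 * n == 1.
# (Only a finite range is searched here; the general fact is elementary.)
two_has_integer_reciprocal = any(2 * n == 1 for n in range(-1000, 1001))
```

Here `b` is the fraction $4/3$, while the search for an integer reciprocal of 2 comes back empty, in line with the remark that $\mathbb{Z}$ is not a field.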

The statement that the reals form an ordered field is the claim that with the usual
notion of $a \le b$ for the real numbers, $\mathbb{R}$ is a totally ordered set, and that the
following axioms hold:

• For all $a, b, c \in \mathbb{R}$: $a \le b \implies a + c \le b + c$.

• For all $a, b, c \in \mathbb{R}$ with $c \ge 0$: $a \le b \implies a \cdot c \le b \cdot c$.

In general, a field $K$ with a total order on it such that these two axioms hold is called
an ordered field. For instance, $\mathbb{Q}$ with the usual ordering is an ordered field. For the
field $\mathbb{C}$, however, it can be shown without too much difficulty that no ordering of it
exists that turns it into an ordered field.

Finally, perhaps the most important property of the real numbers, and certainly
one that sets it apart among all ordered fields, is its Dedekind completeness. Recall
that an upper bound for a subset $S \subseteq \mathbb{R}$ is a real number $a$ such that $s \le a$ for
all $s \in S$. A supremum (or least upper bound) of $S$ is an upper bound $a$ with the
property that if $b$ is any upper bound of $S$, then $a \le b$. Similarly, one defines the
infimum (or greatest lower bound) of $S$ to be a lower bound which is not smaller
than any other lower bound. The statement that $\mathbb{R}$ is Dedekind complete is the claim:
any non-empty bounded above subset $S \subseteq \mathbb{R}$ has a supremum. It can then be shown
that this property is equivalent to: any non-empty bounded below subset $S \subseteq \mathbb{R}$ has
an infimum.

In general, any Dedekind complete ordered field can be taken as essentially the
field of the real numbers. In other words, the axioms listed above of a Dedekind
complete ordered field determine the reals up to an isomorphism, meaning that the
only difference between any two Dedekind complete ordered fields is the naming of
the elements.

Further Reading 

The closing section of this chapter (and of each of the forthcoming chapters) 
consists of suggestions for further reading. The list of sources is deliberately kept 
short and is thus by necessity not comprehensive. It aims to be a starting point for the 
reader interested in learning more about particular aspects of the chapter that were, 
for whatever reason, not elaborated upon in the main text. 

For a broad historical perspective on the development of modern mathematics, 
including a detailed discussion of the birth of modern mathematical analysis, see [11]. 
The reader interested in the interplay between mathematics as a formal system and 
the real world, and the student baffled by the usefulness of mathematics in physics, 
is referred to [8]. For a discussion of the internal forces governing and influencing 
the development of mathematics see the classic text [7]. 

For a more in-depth treatment of set theory the reader may consult the book [9] 
which is also an introduction to logic and proof theory. This book also includes 
a detailed and elementary presentation of the Axiom of Choice and some of its 
equivalents, i.e., Zorn’s Lemma, Zermelo’s well-ordering principle, and the principle 
of cardinal comparability, as well as an elementary treatment of cardinal arithmetic. 
Various books (e.g., [6, 10]) address the Axiom of Choice from a historical point-
of-view, discussing its influence on and within mathematics.

For more on the various possibilities of constructing the real numbers, see the
review given in [5]. Some of these constructions refer to techniques of nonstandard 
analysis, i.e., to models of real numbers which contain actual infinitesimals. The 
reader interested in this somewhat controversial approach to analysis may consult 
the books [1, 3, 4] as well as [2] giving an eightfold path to the subject.


Chapter 2

Linear Spaces 


Abstract This chapter is a rigorous introduction to linear spaces, but with a strong 
emphasis on infinite dimensional linear spaces. No prior knowledge of linear spaces 
is assumed, so that all definitions and proofs are included, but some mathematical 
maturity is assumed, dictating the level of detail given. In particular, all results (e.g., 
existence of dimension) are given in full generality using Zorn’s Lemma. 

Keywords Linear space • Vector space • Linear transformation • Operator • Dimension •
Quotient linear space • Product linear space • Inner product space • Normed
space • Cauchy-Schwarz inequality


This chapter assumes a rudimentary understanding of the linear structure of $\mathbb{R}^n$,
a very rich structure, both algebraically and geometrically. Elements in $\mathbb{R}^n$, when
thought of as vectors, that is as entities representing direction and magnitude, can 
be used to form parallelograms, can be scaled, the angle between two vectors can 
be computed, and the length of a vector can be found. These geometric features are 
given algebraically by means of, respectively, vector addition, scalar multiplication, 
the inner product of two vectors, and the norm of a vector. In this chapter these notions 
are abstracted to give rise to the concepts of linear space, inner product space, and 
normed space. 

The chapter gives a detailed presentation of all of the relevant notions of linear 
spaces, provides examples, and contains rigorous proofs of all of the results therein. 
Exploiting the assumption of a rudimentary understanding of the linear structure of
$\mathbb{R}^n$, and thus of finite dimensional linear spaces, the chapter has a clear secondary
goal, namely to explore the subtleties of infinite dimensional linear spaces. This 
is an absolute necessity with Hilbert spaces in mind, since virtually all interesting 
examples of Hilbert spaces are infinite dimensional. 

A consequence of this infinite dimensional theme of the chapter is that some 
proofs are considerably more involved than their finite dimensional counterparts. 
In particular, the theorems establishing the existence of bases and the concept of 
dimension are sophisticated and resort to an application of Zorn’s Lemma. Also, 
cardinality considerations are important, since one needs to be able to compute, at 
least a little bit, with infinite quantities. To facilitate an easier reading of this chapter,
the Preliminaries contain an account of Zorn’s Lemma (Sect. 1.2.17) and some basic
cardinal arithmetic (Sects. 1.2.12 and 1.2.13). Moreover, the text below indicates 
which proofs can safely be skipped on a first reading, leaving it to the discretion of 
the reader when, and whether, to tackle the technicalities involved in mastering the 
more advanced techniques. 

Section 2.1 introduces the axioms of linear spaces in full generality, establishes 
basic properties and explores examples, most of which will be revisited throughout 
the book. Section 2.2 is concerned with establishing the notion of dimension for an 
arbitrary linear space. In particular, the dimension need not be finite, which, at times, 
necessitates some more intricate proofs. Section 2.3 discusses linear operators, the 
natural choice of structure preserving functions between linear spaces, studies their 
basic properties, and discusses the notion of isomorphic linear spaces. Section 2.4 
introduces standard constructions producing new spaces from given ones, and in 
particular the kernel and image of a linear operator are discussed. The final section 
is devoted to inner product spaces and normed spaces and presents several important 
examples such as function spaces and $\ell_p$ spaces.


2.1 Linear Spaces — Elementary Properties and Examples 

In $\mathbb{R}^2$, the scalar product $\alpha \cdot x$ is between a real number $\alpha \in \mathbb{R}$ and a
vector $x \in \mathbb{R}^2$, and yields again a vector in $\mathbb{R}^2$. However, in similar situations,
such as in $\mathbb{C}^2$, the scalar multiplication $\alpha \cdot x$ is now defined between any complex
number $\alpha \in \mathbb{C}$ and an arbitrary vector $x$. The most general situation is when $\alpha$ is
allowed to vary over the elements of an arbitrary (but fixed) field $K$. Prominent examples
of fields are the field $\mathbb{R}$ of real numbers and the field $\mathbb{C}$ of complex numbers (for a
more detailed discussion of fields the reader is referred to Sect. 1.2.19 of the
Preliminaries). The definition of linear space given below is the result of a distillation of
certain key properties of vector addition and scalar multiplication in $\mathbb{R}^2$ or
$\mathbb{R}^3$, and is formalized
in the most general form, namely with arbitrary fields. If the reader is not familiar 
with any fields other than R or C (or Q, the field of rational numbers), then it is 
perfectly safe to proceed and replace any occurrence of an arbitrary field K by either 
R or C. In the context of this book these are in any case the most important fields. 

Definition 2.1 (Linear Space) Let $K$ be an arbitrary field (such as the field of real
numbers or of complex numbers, for concrete examples). A set $V$ of elements
$x, y, z, \dots$ of an arbitrary nature, together with an operation, called vector addition,
or simply addition, associating with any two elements $x, y \in V$ an element
$z \in V$, called the sum of $x$ and $y$, and denoted by $z = x + y$, as well as an operation
associating with any $x \in V$ and $\alpha \in K$ an element $w \in V$, called the product or
scalar product of $\alpha$ and $x$, and denoted by $w = \alpha \cdot x$, is called a linear space if

1. For all $x, y, z \in V$, the operation of vector addition satisfies:

• Associativity, i.e., $(x + y) + z = x + (y + z)$.

• Commutativity, i.e., $x + y = y + x$.

• Existence of a neutral element, i.e., there exists an element $0 \in V$, for which

$$x + 0 = x = 0 + x.$$

• Existence of additive inverses, i.e., there exists an element $x' \in V$ such that

$$x + x' = 0 = x' + x.$$

2. For all $x \in V$ and $\alpha, \beta \in K$, the scalar product operation satisfies:

• Associativity, i.e., $\alpha \cdot (\beta \cdot x) = (\alpha\beta) \cdot x$.

• Neutrality of $1 \in K$, i.e., $1 \cdot x = x$.

3. For all $x, y \in V$ and $\alpha, \beta \in K$, the scalar product and vector addition operations
are compatible in the sense that

• Scalar product distributes over vector addition

$$\alpha \cdot (x + y) = \alpha \cdot x + \alpha \cdot y.$$

• Scalar product distributes over scalar addition

$$(\alpha + \beta) \cdot x = \alpha \cdot x + \beta \cdot x.$$

A linear space is also called a vector space, its elements are called vectors, and the 
elements of the field K are called scalars. If we wish to emphasize the relevant field, 
then we say that V is a linear space over K. Otherwise, the assertion that V is a 
linear space includes the implicit introduction of a field K serving as the field of 
scalars for V. 

Scalars will typically be denoted by lower-case Greek letters from the beginning
of the alphabet, namely $\alpha, \beta, \gamma$, and so on, while vectors will be denoted by
$x, y, z$, etc. In either case, subscripts or superscripts may be used to enhance readability.


2.1.1 Elementary Properties of Linear Spaces 

We now turn to establish several properties of linear spaces that immediately follow 
from the axioms. 

Proposition 2.1 In any linear space $V$ the following statements hold.

1. The neutral element $0 \in V$ is unique.

2. For all $x \in V$, the additive inverse $x'$ is unique.

3. For all $\alpha \in K$ and $x \in V$, the equation $\alpha \cdot x = 0$ holds if, and only if,
$\alpha = 0$ or $x = 0$.

Proof

1. Suppose that $0' \in V$ is a neutral element. That is

$$x + 0' = x$$

for all $x \in V$, and thus

$$0 = 0 + 0' = 0' + 0 = 0'.$$

2. Suppose that $x'' \in V$ satisfies $x + x'' = 0$. Then

$$x' = x' + 0 = x' + (x + x'') = (x' + x) + x'' = 0 + x'' = x''.$$

3. For $\alpha = 0$

$$\alpha \cdot x = 0 \cdot x = (0 + 0) \cdot x = 0 \cdot x + 0 \cdot x \implies 0 \cdot x = 0.$$

For $x = 0$

$$\alpha \cdot x = \alpha \cdot 0 = \alpha \cdot (0 + 0) = \alpha \cdot 0 + \alpha \cdot 0 \implies \alpha \cdot 0 = 0.$$

In the other direction, if $\alpha \cdot x = 0$ and $\alpha \ne 0$, then upon multiplication by
$\alpha^{-1}$, one obtains

$$x = 1 \cdot x = (\alpha^{-1} \cdot \alpha) \cdot x = \alpha^{-1} \cdot (\alpha \cdot x) = \alpha^{-1} \cdot 0 = 0. \qquad \square$$

Remark 2.1 It similarly follows that for any vector $x$, the additive inverse $x'$ is given
by $x' = (-1) \cdot x$. It is further common to neglect the $\cdot$ denoting the scalar product,
write $x' = -x$ and resort to the familiar conventions for algebraic manipulations on
vectors and scalars commonly used for addition and multiplication, e.g., we write
$x - y$ for $x + (-y)$ or $x + y + z$ for $(x + y) + z$, and so on. This convention, of
course, considerably shortens proofs such as those given above and will be silently
used throughout the text below.

Remark 2.2 Most of the linear spaces we shall be concerned with will be over the 
field R of real numbers, in which case they are called real linear spaces, or over the 
field C of complex numbers, in which case they are called complex linear spaces. 
In the absence of further specification, or as implied by context, a linear space is 
assumed to be either a real or a complex linear space. 

Remark 2.3 We make no notational distinction between the zero vector, i.e., the
neutral element with respect to vector addition, and the element 0 in the field $K$.
This is a generally safe practice, since context typically immediately points to the
correct interpretation. For instance, in the equation $0 \cdot x = 0$, context dictates that
the 0 on the left-hand side is the scalar $0 \in K$ while on the right-hand side it is the
zero vector $0 \in V$.


2.1.2 Examples of Linear Spaces 

Since linear spaces are very common in mathematics, presenting an exhaustive list 
of linear spaces is a daunting task. The chosen examples below are meant to present 
some commonly occurring linear spaces, to explore some less common possibilities, 
and to familiarize the reader with some linear spaces of great importance in the 
context of this book. The latter refers to linear spaces of sequences and of functions, 
linear spaces that will be revisited throughout the rest of the text. 

Example 2.1 In the familiar spaces $\mathbb{R}^2$ and $\mathbb{R}^3$, which are the mathematical models
of the physical plane and space, vector addition has the geometric interpretation
known as the parallelogram law, while scalar multiplication $ax$ has a scaling effect
on the vector $x$ determined by the magnitude and sign of $a$.

Example 2.2 Given a natural number $n \geq 1$, let $\mathbb{R}^n$ be the set of $n$-tuples of real
numbers. Thus if $x \in \mathbb{R}^n$, then it is of the form $x = (x_1, \dots, x_n)$ where (for every
$1 \leq k \leq n$) $x_k$, the $k$-th component of $x$, is a real number.

For all $x, y \in \mathbb{R}^n$ and $a \in \mathbb{R}$, defining

\[ x + y = (x_1 + y_1, x_2 + y_2, \dots, x_n + y_n) \]
\[ ax = (ax_1, ax_2, \dots, ax_n) \]

endows $\mathbb{R}^n$ with the structure of a linear space over the field $\mathbb{R}$, as is easy to verify.
Obviously, the cases $n = 2$ and $n = 3$ recover the familiar linear spaces $\mathbb{R}^2$ and
$\mathbb{R}^3$ (respectively). Analogously, the set $\mathbb{C}^n$ of all $n$-tuples of complex numbers, with
similar coordinate-wise operations, is a linear space over the field $\mathbb{C}$.
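
The coordinate-wise operations can be tried out directly. The following is a minimal sketch, not part of the original text: it uses exact rational numbers (Python's `Fraction`) as a stand-in for real scalars, the helper names `add` and `scale` are chosen here purely for illustration, and the assertions only spot-check a few linear space axioms on sample vectors rather than prove them.

```python
from fractions import Fraction

def add(x, y):
    """Coordinate-wise sum of two n-tuples."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of components")
    return tuple(xk + yk for xk, yk in zip(x, y))

def scale(a, x):
    """Coordinate-wise product of the scalar a with the n-tuple x."""
    return tuple(a * xk for xk in x)

# Sample vectors in Q^3 (rationals standing in for reals) and sample scalars.
x = (Fraction(1), Fraction(-2), Fraction(1, 2))
y = (Fraction(0), Fraction(3), Fraction(5, 2))
a, b = Fraction(2), Fraction(-1, 3)
zero = (Fraction(0),) * 3

# Spot checks of some linear space axioms on these samples.
assert add(x, y) == add(y, x)                                # commutativity
assert add(x, zero) == x                                     # neutral element
assert add(x, scale(-1, x)) == zero                          # additive inverse
assert scale(a, add(x, y)) == add(scale(a, x), scale(a, y))  # distributivity
assert scale(a + b, x) == add(scale(a, x), scale(b, x))      # distributivity
assert scale(a * b, x) == scale(a, scale(b, x))              # compatibility
```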

Example 2.3 Let $\mathbb{R}^\infty$ be the set of infinite sequences of real numbers. Thus, a typical
element $x \in \mathbb{R}^\infty$ is of the form $x = (x_1, x_2, \dots, x_k, \dots)$ where $x_k \in \mathbb{R}$, for each
$k \geq 1$, is a real number, the $k$-th component of $x$. For all $x, y \in \mathbb{R}^\infty$ and $a \in \mathbb{R}$,
setting

\[ x + y = (x_1 + y_1, x_2 + y_2, \dots, x_k + y_k, \dots) \]
\[ ax = (ax_1, ax_2, \dots, ax_k, \dots) \]

endows $\mathbb{R}^\infty$ with the structure of a linear space over the field $\mathbb{R}$, as is easily seen.
Similarly, the set $\mathbb{C}^\infty$ of all infinite sequences of complex numbers, with similar
coordinate-wise operations, is a linear space over the field $\mathbb{C}$.

Remark 2.4 The convention that for an element $x$ in either $\mathbb{R}^n$, $\mathbb{C}^n$, $\mathbb{R}^\infty$, or $\mathbb{C}^\infty$ its
$k$-th component is denoted by $x_k$ (as illustrated in the preceding examples) will be
used throughout this text. To refer to a sequence of such vectors we may thus use
superscripts for the different vectors, e.g., $\{x^{(m)}\}_{m \geq 1}$, and then $x^{(m)}_k$ refers to the
$k$-th component of the $m$-th vector.

Example 2.4 Consider the subset $c \subseteq \mathbb{C}^\infty$ consisting of all convergent sequences
(here convergence is in the usual sense of convergence of sequences of complex
numbers), let $c_0 \subseteq c$ be the subset consisting of all sequences that converge to 0, and
let $c_{00} \subseteq c_0$ be the set

\[ c_{00} = \{(x_1, \dots, x_m, 0, 0, 0, \dots) \mid m \geq 1,\ x_1, \dots, x_m \in \mathbb{C}\} \]

of all sequences that are eventually 0. With the same definition of addition and
multiplication as in Example 2.3, each of these sets is easily seen to be a linear space
over $\mathbb{C}$. Obviously, replacing $\mathbb{C}$ throughout by $\mathbb{R}$ yields similar linear spaces over $\mathbb{R}$.

Example 2.5 The constructions given above of the linear spaces $\mathbb{R}^n$, $\mathbb{C}^n$, $\mathbb{R}^\infty$, and
$\mathbb{C}^\infty$ easily generalize to any field $K$ and to any cardinality. Indeed, consider an
arbitrary set $B$ and an arbitrary field $K$. Recall from the Preliminaries (Sect. 1.2.8)
that the set $K^B$ is the set of all functions $x : B \to K$. For all such functions

\[ x : B \to K, \quad y : B \to K \]

define the functions $x + y : B \to K$ and $ax : B \to K$ by

\[ (x + y)(b) = x(b) + y(b), \quad (ax)(b) = a \cdot x(b). \]

With these notions of addition and scalar multiplication, the set $K^B$ is easily seen to
be a linear space over $K$. In particular, we obtain the linear spaces $\mathbb{R}^{\{1,2,\dots,n\}} = \mathbb{R}^n$,
$\mathbb{C}^{\{1,2,\dots,n\}} = \mathbb{C}^n$, $\mathbb{R}^{\mathbb{N}} = \mathbb{R}^\infty$, and $\mathbb{C}^{\mathbb{N}} = \mathbb{C}^\infty$. Further, restricting attention to the
subset $(K^B)_0$ consisting only of those functions $x : B \to K$ for which $x(b) = 0$ for
all but finitely many $b \in B$ also yields a linear space, with exactly the same definition
for addition and scalar multiplication as in $K^B$. In particular the linear space $c_{00}$
from Example 2.4 is recovered by noticing that $c_{00} = (\mathbb{C}^{\mathbb{N}})_0$. Note however that for
a general field $K$ no analogue of the linear spaces $c$ or $c_0$ need exist since $K$ need
not have any notion of convergence.
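
A finitely supported function $B \to K$ is determined by its finitely many non-zero values, so it can be stored as a finite table. The sketch below is an illustration only, under assumptions that are not in the text: the field is taken to be $\mathbb{Q}$ (via `Fraction`), the index set $B$ is the natural numbers, and a function is represented by a Python dictionary in which absent keys stand for the value 0. The helper names `normalize`, `add`, and `scale` are hypothetical.

```python
from fractions import Fraction

def normalize(x):
    """Drop zero values so that every element has exactly one dict representation."""
    return {b: v for b, v in x.items() if v != 0}

def add(x, y):
    """Pointwise sum (x + y)(b) = x(b) + y(b); absent keys stand for the value 0."""
    keys = set(x) | set(y)
    return normalize({b: x.get(b, 0) + y.get(b, 0) for b in keys})

def scale(a, x):
    """Pointwise scalar multiple (a x)(b) = a * x(b)."""
    return normalize({b: a * v for b, v in x.items()})

# Two eventually-zero sequences with rational entries, indexed from 1.
x = {1: Fraction(1), 3: Fraction(-2)}      # (1, 0, -2, 0, 0, ...)
y = {3: Fraction(2), 7: Fraction(1, 2)}    # (0, 0, 2, 0, 0, 0, 1/2, 0, ...)

s = add(x, y)                # the entries at index 3 cancel
zero = add(x, scale(-1, x))  # x + (-1) x is the zero function
```

The sum of two finitely supported functions is again finitely supported because its support is contained in the union of the two supports, which is what the set union in `add` reflects.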

Example 2.6 For $n \geq 0$, let $P_n$ be the set of all polynomial functions, that is functions
of the form $p(t) = a_n t^n + \dots + a_1 t + a_0$, with real coefficients and degree at most
$n$. With the ordinary operations

\[ (p + q)(t) = p(t) + q(t), \quad (ap)(t) = a \cdot p(t) \]

taken as addition and scalar multiplication, it is straightforward to verify that $P_n$ is a
linear space over $\mathbb{R}$. Removing the restriction on the degrees of the polynomials one
obtains the set $P$ of all polynomial functions with real coefficients which, again with
the obvious notions of addition and scalar multiplication, forms a linear space over
$\mathbb{R}$. One may also consider polynomials with coefficients in the field $\mathbb{C}$ of complex
numbers, and obtain similar linear spaces over $\mathbb{C}$.

Example 2.7 Let $I$ be a subset of $\mathbb{R}$ which is either an open interval $(a, b)$, a closed
interval $[a, b]$, or the entire real line $\mathbb{R}$. Consider the set $C(I, \mathbb{R})$ of all continuous
real-valued functions $x : I \to \mathbb{R}$. The familiar definitions

\[ (x + y)(t) = x(t) + y(t), \quad (ax)(t) = a \cdot x(t), \]

when applied to continuous functions $x, y : I \to \mathbb{R}$, are well-known to give
continuous functions again, and it is easy to see that when these operations are taken as
addition and scalar multiplication, the set $C(I, \mathbb{R})$ is a linear space over $\mathbb{R}$. One may
also consider the set $C(I, \mathbb{C})$ of all continuous complex-valued functions $x : I \to \mathbb{C}$
to similarly obtain a linear space over $\mathbb{C}$. One may also consider, for each $k \geq 1$, the
set $C^k(I, \mathbb{R})$ of all functions $x : I \to \mathbb{R}$ with a continuous $k$-th derivative, which is
similarly a linear space over $\mathbb{R}$. It is then customary to equate $C^0(I, \mathbb{R})$ with $C(I, \mathbb{R})$.
We may also allow $k = \infty$ so as to obtain $C^\infty(I, \mathbb{R})$, the linear space of all infinitely
differentiable functions.

Example 2.8 Let $K$ be any field and $F$ a proper subfield of $K$, namely $F$ is a proper
subset of $K$, and with the induced operations from $K$ it is itself a field. For instance,
$\mathbb{Q}$ is a subfield of $\mathbb{R}$ which in turn is a subfield of $\mathbb{C}$. In such a situation any linear
space $V$ over the larger field $K$ is also a (different) linear space over the smaller field $F$.
The reason is that the additive structure of $V$ is unaffected by the choice of field of
scalars, whereas the axioms relating to the scalar product, if they hold for scalars
ranging over $K$, certainly hold for scalars ranging over $F$ (since all scalar axioms
are universally quantified equations). The process of considering a linear space
over $K$ as a linear space over $F$ is named restriction of scalars.

In particular, each of the examples above of a linear space over $\mathbb{C}$ is also a linear
space over $\mathbb{R}$, and also a linear space over $\mathbb{Q}$. Any linear space obtained by restriction
of scalars $F \subseteq K$ is (except for trivial cases) very different from the original space.
Another particular instance of restriction of scalars is the observation that since $\mathbb{R}$ is
a linear space over itself (if this is confusing, pause for a minute to realize that this is
essentially the assertion that $\mathbb{R}^1$ is a linear space), restriction of scalars implies that
$\mathbb{R}$ is also a linear space over $\mathbb{Q}$.

Exercises 

Exercise 2.1 Let $V$ be a linear space over $K$ and $f : V \to X$ a bijection where $X$
is a set with no a priori extra structure. Let $f^{-1} : X \to V$ be the inverse function of
$f$. Prove that the operations

\[ x + y = f(f^{-1}(x) + f^{-1}(y)) \]

and

\[ ax = f(a f^{-1}(x)), \]

defined for all $x, y \in X$ and $a \in K$, turn $X$ into a linear space over $K$.

Exercise 2.2 For any $a \in \mathbb{C}$ let $c_a$ be the set of all sequences $(x_1, x_2, \dots) \in \mathbb{C}^\infty$
which converge to $a$. Prove that the linear structure of $\mathbb{C}^\infty$ restricts to a linear structure
on $c_a$ if, and only if, $a = 0$.

Exercise 2.3 Let $K$ be a field (such as $\mathbb{R}$ or $\mathbb{C}$ for familiarity) and let $M_{n,m}(K)$ be
the set of $n \times m$ matrices with entries in the field $K$. Prove that with ordinary matrix
addition and scalar product, the set $M_{n,m}(K)$ is a linear space over $K$.

Exercise 2.4 With the usual operations of addition and multiplication, is the set
$(\mathbb{R} - \mathbb{Q}) \cup \{0\}$ a linear space over $\mathbb{Q}$?

Exercise 2.5 Let $V$ be a linear space over $K$ and fix some vector $w_0 \in V$. Define on
the set $V$ an addition operation by $x \oplus y = x + y - w_0$ and a scalar product operation
by $a \odot x = a(x - w_0) + w_0$. Prove that with these operations $V$ is a linear space over $K$
whose zero vector is $w_0$.

Exercise 2.6 Prove that the set $\mathbb{R}_+$ of positive real numbers with addition given by
$x \oplus y = xy$ and scalar multiplication given by $a \odot y = y^a$ is a linear space over
the field $\mathbb{R}$.

Exercise 2.7 Let X be a set with one element. Prove that for any field K there is a 
unique choice of operations that turns X into a linear space over K . 

Exercise 2.8 Prove that any linear space over R has either just a single vector or 
infinitely many. 

Exercise 2.9 Let $V$ be a linear space. Prove that $-x = (-1) \cdot x$ holds for all vectors
$x \in V$.

Exercise 2.10 Let $K$ be a field and $\mathcal{B}$ an arbitrary set. Consider the set $\langle \mathcal{B} \rangle_K$ of all
formal expressions of the form

\[ \sum_{b \in \mathcal{B}} a_b \cdot b \]

where $a_b \in K$ for each $b \in \mathcal{B}$ and at most finitely many of the $a_b$ are non-zero.
Verify that the obvious way to define addition and scalar multiplication on the set
$\langle \mathcal{B} \rangle_K$ turns it into a linear space over $K$ (known as the free linear space generated
by $\mathcal{B}$).


2.2 The Dimension of a Linear Space

The familiar linear spaces $\mathbb{R}^n$ are all infinite sets (in fact, they all have the same
cardinality, the cardinality $\mathfrak{c}$ of the continuum). However, it is intuitively clear that
$\mathbb{R}^3$ is considerably ‘larger’ than, say, $\mathbb{R}^2$. This fact is usually expressed in the claim
that $\mathbb{R}^3$ has dimension 3, while $\mathbb{R}^2$ has dimension 2. The notion of dimension in
general linear spaces is rather subtle, especially for the infinite dimensional ones.
We now turn to investigate this situation in full generality, starting with the related
concepts of linear independence and spanning sets.


2.2.1 Linear Independence, Spanning Sets, and Bases 

By forming linear combinations of vectors from any given set of vectors in a linear 
space, one obtains a (potentially) large collection of vectors. The original set is said 
to be linearly independent if, intuitively, it does not allow for any redundancies when 
forming linear combinations while it is a spanning set if it is sufficiently large to 
generate any vector as a linear combination. The precise definitions follow. 

Definition 2.2 Given any set $S \subseteq V$ of vectors in a linear space $V$ ($S$ may be finite
or infinite), a linear combination of elements of $S$ is any vector of the form

\[ \sum_{k=1}^{m} a_k x_k \]

with $x_1, \dots, x_m$ vectors from $S$ and $a_1, \dots, a_m \in K$ arbitrary scalars. Equivalently,

\[ \sum_{s \in S} a_s \cdot s \]

with $a_s = 0$ for all but finitely many $s \in S$. The two forms are essentially the same,
differing only notationally. The span of $S$ is then the set of all linear combinations
of elements from $S$, and $S$ is said to be a spanning set if its span is the entire linear
space $V$. A spanning set $S$ is a minimal spanning set if it is itself a spanning set
but no proper subset of it is a spanning set. Further, $S$ is said to be a set of linearly
independent vectors if the only possibility of expressing the zero vector as a linear
combination of elements from $S$ is the trivial linear combination, i.e., where all the
coefficients are 0. That is, $S$ is linearly independent if whenever one has

\[ \sum_{k=1}^{m} a_k x_k = 0 \]

with $x_1, \dots, x_m$ vectors in $S$, then necessarily $a_k = 0$ for all $1 \leq k \leq m$. The set $S$
is said to be a maximal linearly independent set if it is itself linearly independent but
any set that properly contains it is not linearly independent. A set that is not linearly
independent is also referred to as a linearly dependent set. Finally, a set that is both
a spanning set and is linearly independent is said to be a basis of the linear space.

Remark 2.5 We speak of vectors $x_1, \dots, x_m \in V$ as being either spanning or linearly
independent, if the set $\{x_1, \dots, x_m\}$ is spanning or linearly independent. Of course,
we may also consider countably infinitely many vectors $x_1, x_2, \dots$ as being spanning
or linearly independent, in a similar fashion.

Example 2.9 The situation in $\mathbb{R}^n$ is probably very familiar to the reader. Any $m$
vectors $x_1, \dots, x_m$ in $\mathbb{R}^n$ are linearly independent if, and only if, the equation

\[ \sum_{k=1}^{m} a_k x_k = 0 \]

admits the unique solution $a_1 = a_2 = \dots = a_m = 0$. It is well-known that linear
independence implies $m \leq n$. Similarly, the given vectors are spanning if, and only
if, for every vector $b \in \mathbb{R}^n$ the equation

\[ \sum_{k=1}^{m} a_k x_k = b \]

admits a solution. It is again a familiar fact that if the given vectors are spanning,
then $m \geq n$. It thus follows that a basis for $\mathbb{R}^n$ must consist of precisely $n$ vectors.
In particular, all bases have the same size, namely $n$, which is referred to as the
dimension of $\mathbb{R}^n$. Below we prove that every linear space has a dimension, if we
allow infinite cardinalities into the picture. The result in that generality subsumes the
properties of $\mathbb{R}^n$ just mentioned.
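
Both conditions can be tested mechanically by Gaussian elimination, since $m$ vectors are linearly independent exactly when the matrix having them as rows has rank $m$, and they span the whole space exactly when that rank equals $n$. The sketch below is an illustration under stated assumptions, not a method taken from the text: it works over $\mathbb{Q}$ with exact `Fraction` arithmetic in place of $\mathbb{R}$, and the function names `rank`, `is_independent`, and `is_spanning` are chosen here for convenience.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of vectors in Q^n, by Gaussian elimination over the rationals."""
    rows = [[Fraction(c) for c in v] for v in vectors]
    r = 0  # number of pivots found so far
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        # Look for a row at or below position r with a non-zero entry in this column.
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # Clear the entries below the pivot.
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [ci - factor * cr for ci, cr in zip(rows[i], rows[r])]
        r += 1
    return r

def is_independent(vectors):
    """m vectors are linearly independent exactly when their rank is m."""
    return rank(vectors) == len(vectors)

def is_spanning(vectors, n):
    """Vectors with n components span the whole space exactly when their rank is n."""
    return rank(vectors) == n

two = [(1, 0, 1), (0, 1, 1)]      # independent, too few to span
three = two + [(1, 1, 0)]         # independent and spanning: a basis
four = three + [(1, 1, 1)]        # spanning, too many to be independent
```

The three sample lists illustrate the inequalities $m \leq n$ for independence and $m \geq n$ for spanning in the case $n = 3$.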

Example 2.10 In the space $\mathbb{C}^2$, considered as a linear space over $\mathbb{C}$, the vectors $(1, 0)$
and $(0, 1)$ are immediately seen to form a basis. However, if $\mathbb{C}^2$ is considered as a
linear space over $\mathbb{R}$ (by the procedure of restriction of scalars from Example 2.8),
then these two vectors are (of course) still linearly independent but they fail to span
$\mathbb{C}^2$. Indeed, since only real scalars may now be used to form linear combinations of
these vectors, the span will only be $\mathbb{R}^2$. To obtain a basis, the two vectors need to be
augmented, for instance, by the vectors $(i, 0)$ and $(0, i)$. The four vectors together
do form a basis of $\mathbb{C}^2$ as a linear space over $\mathbb{R}$.
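
To see why these four vectors suffice, one may write each complex component in terms of its real and imaginary parts. The symbols $a_1, b_1, a_2, b_2$ below are introduced only for this illustration:

```latex
\[
(z_1, z_2) = (a_1 + i b_1,\; a_2 + i b_2)
           = a_1\,(1,0) + b_1\,(i,0) + a_2\,(0,1) + b_2\,(0,i),
\qquad a_1, b_1, a_2, b_2 \in \mathbb{R}.
\]
```

Since the real and imaginary parts of a complex number are uniquely determined, the four real coefficients are unique, which gives both spanning and linear independence over $\mathbb{R}$.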

Example 2.11 In the linear space $\mathbb{R}^n$ or $\mathbb{C}^n$, the vectors $x_1, \dots, x_n$ where

\[ x_k = (0, \dots, 0, 1, 0, \dots, 0) \]

with 1 in the $k$-th position, are easily seen to be spanning and linearly independent,
and thus form a basis, called the standard basis of $\mathbb{R}^n$, respectively $\mathbb{C}^n$. It is obvious
that $\mathbb{R}^n$ and $\mathbb{C}^n$ have infinitely many bases.

In the examples presented so far it was quite straightforward to obtain a basis. The 
following example shows that this is not always the case. In fact, it is not even clear 
that the next linear space even has a basis. 

Example 2.12 Let $\mathbb{C}^\infty$ be the linear space from Example 2.3. The vectors $x_1, x_2, \dots$,
given by

\[ x_k = (0, \dots, 0, 1, 0, \dots) \]

with 1 in the $k$-th position are easily seen to be linearly independent, but they do
not form a spanning set. Indeed, since the span consists only of the finite linear
combinations of vectors from the set, the span in this case is the set of all vectors of
the form $(a_1, \dots, a_k, 0, \dots, 0, 0, \dots)$, namely those infinite sequences of complex
numbers that are eventually 0. In other words, the span in $\mathbb{C}^\infty$ of the vectors $x_1, x_2, \dots$
is the space $c_{00}$ from Example 2.4, and thus we incidentally found a basis for $c_{00}$.
It is now tempting to proceed as follows. Taking any vector $y_1 \in \mathbb{C}^\infty$ which is not
spanned by $x_1, x_2, \dots$, for instance the vector $y_1 = (1, 1, 1, \dots, 1, \dots)$, forming the
set $\{y_1, x_1, x_2, \dots\}$ must get us closer to obtaining a basis. Indeed, the new set is still
linearly independent precisely because $y_1$ was not spanned by the rest of the vectors.
But this new set is still not a basis as there are still many vectors it fails to span, for
instance the vector $y_2 = (1, 0, 1, 0, 1, 0, \dots)$. Of course, we may now consider the
larger set $\{y_1, y_2, x_1, x_2, \dots\}$, but it too fails to be a basis. One may attempt to resolve
the argument once and for all by claiming that proceeding in this way to infinity will
eventually result in a basis. However, this is a very vague statement, and even if this
process can be carried out mathematically (which it can, as will be shown below), it
is entirely unclear as to which vectors will end up in the basis and which will not. In
other words, even if this linear space has a basis, it is unlikely we can ever present
one.

Naturally, similar observations hold true for $\mathbb{R}^\infty$ instead of $\mathbb{C}^\infty$, and in fact for
most of the linear spaces in this book, and in analysis in general.

Example 2.13 Recall (see Example 2.8) that one can consider the space $\mathbb{R}$ as a
linear space over the field $\mathbb{Q}$ of rational numbers (as a particular case of restriction
of scalars). A real number $a$ is said to be transcendental if it is not the root of a
non-zero polynomial with rational coefficients. Examples of transcendental numbers
include $e$ and $\pi$, though the proofs are far from trivial. For any transcendental number $x$
the set $\{1, x, x^2, x^3, \dots\}$ is linearly independent (this is basically the definition of $x$
being transcendental), but it is not a spanning set.

Example 2.14 Recall the space $P_n$ from Example 2.6 of all polynomial functions
with real coefficients of degree at most $n$. For every $k \geq 0$ let $p_k$ be the vector
$p_k(t) = t^k$. Clearly, the vectors $p_0, p_1, p_2, \dots, p_n$ form a basis for $P_n$, but again
there are infinitely many other choices of bases. The space $P$ of all polynomial
functions with real coefficients has a countable basis given by $p_0, p_1, p_2, \dots$, a fact
that is easily verified. Noticing that polynomials are continuous functions, no matter
on which interval they are defined, we see by the above that in the linear space $C(I, \mathbb{R})$
of all continuous functions $x : I \to \mathbb{R}$, where $I$ is a non-degenerate interval (i.e., not
reducing to a point) the vectors $p_0, p_1, p_2, \dots$ are linearly independent. However,
they do not form a basis since any linear combination of these vectors is again
a polynomial function, but not all continuous functions are polynomial functions.
Once more, it is not at all clear that $C(I, \mathbb{R})$ even has a basis (what would one look
like?).

From the discussion above we see that in some linear spaces, such as $\mathbb{R}^n$, $P_n$, or
$P$, it is quite easy to find bases while in other linear spaces, such as $\mathbb{R}^\infty$, $\mathbb{R}$ over $\mathbb{Q}$, or
$C(I, \mathbb{R})$, it is a highly non-trivial task. In more detail, we saw that it is rather simple
to exhibit a large set of linearly independent vectors, but it is not so easy to have
these vectors also span the entire space (the spaces $c_{00}$ and $P$, as discussed above,
are the exception to the rule). In fact, it is not at all clear that one can find bases
for $\mathbb{R}^\infty$ or $\mathbb{C}^\infty$, for $\mathbb{R}$ as a linear space over $\mathbb{Q}$, or for $C(I, \mathbb{R})$.

2.2.2 Existence of Bases 

In order to establish that every linear space does have a basis, we observe some 
immediate facts. It should be noted at once that the existence proof uses Zorn’s 
Lemma in an essential way. That is, it can be shown that if every linear space has 
a basis, then the Axiom of Choice holds. Consequently, the existence proof is not 
constructive. 

Proposition 2.2 Let $S \subseteq V$ be a set of vectors in a linear space $V$. The following
conditions are equivalent.

1. S is a maximal linearly independent set. 

2. S is a minimal spanning set. 

3. S is a basis. 

Proof First we show that if $S$ is a maximal linearly independent set, then it is a basis.
All that is needed is to show that $S$ is a spanning set. To that end, let $x \in V$ be a
vector in the ambient linear space. If $x \in S$, then it is certainly spanned by $S$. If
$x \notin S$, then by virtue of $S$ being a maximal linearly independent set, the set $S \cup \{x\}$
is linearly dependent. Thus, there exist vectors $x_1, \dots, x_m \in S \cup \{x\}$ and non-zero
scalars $a_1, \dots, a_m \in K$ with

\[ \sum_{k=1}^{m} a_k x_k = 0. \]

However, in the expression above it must be that $x$ itself appears as a summand since
otherwise we would have expressed 0 as a non-trivial linear combination of vectors
from the linearly independent set $S$. We may thus isolate $x$ in the expression above to
obtain it as a linear combination of elements from $S$. As $x$ was arbitrary, we conclude
that $S$ is a spanning set.
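
The isolation step can be spelled out as follows. Here the index $j$, marking the position at which $x$ occurs among the distinct vectors $x_1, \dots, x_m$, is notation introduced only for this illustration:

```latex
\[
\sum_{k=1}^{m} a_k x_k = 0, \qquad x_j = x, \qquad a_j \neq 0
\quad\Longrightarrow\quad
x = -a_j^{-1} \sum_{k \neq j} a_k x_k ,
\]
```

where every $x_k$ with $k \neq j$ belongs to $S$. Dividing by $a_j$ is possible precisely because the scalars come from a field.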

Next we show that if $S$ is a minimal spanning set, then it is a basis. All that
is needed is to show that $S$ is linearly independent, and indeed, if it were linearly
dependent, then we would have some expression as above, giving 0 as a non-trivial
linear combination of vectors from $S$. Using that expression we are able to isolate
some vector $x \in S$ and exhibit it as a linear combination of other vectors from $S$.
It is then easy to see that $S - \{x\}$ is still a spanning set (simply since any linear
combination containing $x$ can be replaced by one that does not). But this contradicts
$S$ being a minimal spanning set, and thus $S$ must be linearly independent.

So far we have shown that each of conditions 1 and 2 implies condition 3. The 
proof will be completed by showing the converse of these implications. The details 
are very similar in spirit to those given so far, and thus the rest of the proof is left for 
the reader. □ 

We are now ready to establish that every linear space has a basis. The proof
makes essential use of Zorn’s Lemma (see Sect. 1.2.17 of the Preliminaries) and the
reader may safely choose to skip the proof on a first reading. To the reader interested
in the technique of Zorn’s Lemma we remark that the proof below is actually a
straightforward application with few technical difficulties, and is thus a fortunate
first encounter with this important proof technique.

Theorem 2.1 Every linearly independent set $A \subseteq V$ in a linear space $V$ can be
extended to a maximal linearly independent set. In particular, by considering the
linearly independent set $\emptyset \subseteq V$ and by Proposition 2.2, it follows that every linear
space $V$ has a basis.

Proof Consider the set $P$ of all linearly independent subsets $S$ of $V$ that contain
$A$, which we order by set inclusion. Evidently $P$ is a poset, and to find a maximal
linearly independent set that extends $A$ amounts to finding a maximal element in $P$,
and so we apply Zorn’s Lemma (Sect. 1.2.17 of the Preliminaries). First we note that
$P$ is certainly not empty since clearly $A \in P$. Now, assume that $\{S_i\}_{i \in I}$ is a chain in
$P$. We will show that $S = \bigcup_{i \in I} S_i$ is an upper bound for the chain. Clearly $S_i \subseteq S$
for all $i \in I$, thus the only thing to show is that $S \in P$, namely that $S$ contains $A$
(which is immediate) and that it is linearly independent. For that, assume that

\[ \sum_{k=1}^{m} a_k x_k = 0 \]

is a non-trivial linear combination of the vectors $x_1, \dots, x_m \in S$. Since $S$ is the union
of the $S_i$, it follows that each $x_k \in S_{i(k)}$ for a suitable $i(k) \in I$, but since $\{S_i\}_{i \in I}$ is a
chain it follows that there is a single index $i_0 \in I$ such that $x_1, \dots, x_m \in S_{i_0}$. But
then the equality above expresses the zero vector as a non-trivial linear combination
of vectors from $S_{i_0}$, contradicting the fact that $S_{i_0}$ is linearly independent. With that
the conditions of Zorn’s Lemma are satisfied, and so the existence of a maximal
element in $P$ is guaranteed. This maximal element is a set $S_M \in P$, namely $S_M$
contains $A$ and $S_M$ is a maximal set of linearly independent vectors, as required. □
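
In a finite dimensional space the extension can be carried out by hand, with no appeal to Zorn's Lemma: one simply tries candidate vectors one at a time and keeps those that preserve linear independence. The sketch below illustrates this finite analogue only, and is not the argument of the proof. It assumes the space is $\mathbb{Q}^n$ with exact `Fraction` arithmetic, takes its candidates from the standard basis, and uses the hypothetical names `is_independent` and `extend_to_basis`.

```python
from fractions import Fraction

def is_independent(vectors):
    """Check linear independence over Q by Gaussian elimination."""
    rows = [[Fraction(c) for c in v] for v in vectors]
    r = 0  # number of pivots found so far
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [ci - factor * cr for ci, cr in zip(rows[i], rows[r])]
        r += 1
    return r == len(rows)

def extend_to_basis(independent, n):
    """Extend a linearly independent list in Q^n to a basis of Q^n.

    Candidates are taken from the standard basis; a candidate is kept only
    if the enlarged list is still linearly independent.
    """
    basis = [tuple(v) for v in independent]
    if not is_independent(basis):
        raise ValueError("the starting list must be linearly independent")
    for k in range(n):
        e_k = tuple(1 if j == k else 0 for j in range(n))
        if is_independent(basis + [e_k]):
            basis.append(e_k)
    return basis

basis = extend_to_basis([(1, 1, 0)], 3)
```

Every standard basis vector is either kept or rejected for being dependent on the list built so far, so the final list is linearly independent and spans $\mathbb{Q}^n$. In the infinite dimensional case no such step-by-step procedure is available, which is where Zorn's Lemma enters.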


2.2.3 Existence of Dimension 

Now that we know that every linear space has at least one basis, it is tempting to
define the dimension of a linear space to be the cardinality of a basis for it. However,
for this to make sense we need to know that all bases have the same cardinality.
While this is a very plausible assertion (certainly for such familiar spaces as $\mathbb{R}^n$), it
requires a careful proof, particularly in the infinite dimensional case. The important
ingredient is the following lemma, which states that the cardinality of any linearly
independent set is not greater than the cardinality of any spanning set. This result
again uses Zorn’s Lemma and its proof may safely be skipped on a first reading. A
word of caution to the ambitious reader: this proof, compared to the proof of
Theorem 2.1, is technically more demanding.

Lemma 2.1 If $V$ is a linear space, $I \subseteq V$ a linearly independent set of vectors, and
$S \subseteq V$ a spanning set, then there exists an injective function $f : I \to S$.

Proof Consider a pair $(J, f)$ where $J \subseteq I$ and $f : J \to S$ is an injection. The
idea will be (using Zorn’s Lemma) to keep on extending the domain of $f$ until the
entire set $I$ is exhausted. An important ingredient for achieving that is the following.
Thinking of the injective function $f : J \to S$ as an instruction to replace the vectors
in $J$ by their images in $S$, we only consider such pairs $(J, f)$ for which $(I - J) \cup f(J)$
is still a linearly independent set. Let us now form the set $P$ of all such pairs and
introduce an ordering on it declaring that $(J_1, f_1) \leq (J_2, f_2)$ precisely when $J_1 \subseteq J_2$
and $f_2$ extends $f_1$ (the latter means that $f_1(x) = f_2(x)$ holds for all $x \in J_1$). It is
immediate that $P$ is a poset, and that a maximal element in it will furnish us with an
injective function $f : J_M \to S$ for a very large subset $J_M \subseteq I$. We will then show
that necessarily $J_M = I$, and the result will be established.

To verify the conditions of Zorn’s Lemma, notice first that $P$ is not empty. Indeed,
the pair $(\emptyset, \emptyset)$ is in $P$. Next, let $\{(J_t, f_t)\}_{t \in T}$ be a chain in $P$, for which we must now
find an upper bound. Let

\[ J = \bigcup_{t \in T} J_t \]

and notice that we may define $f : J \to S$ as follows. Given $x \in J$ there is some
$t \in T$ such that $x \in J_t$, and so let $f(x) = f_t(x)$. To see that this is a well-defined
function, i.e., that it is independent of the choice of $t \in T$, note that if $x \in J_{t'}$, then
either $J_t \subseteq J_{t'}$ or $J_{t'} \subseteq J_t$ and then either $f_{t'}$ extends $f_t$ or $f_t$ extends $f_{t'}$, and in either
case $f_t(x) = f_{t'}(x)$. It is clear that $J_t \subseteq J$ and that $f$ extends $f_t$, for all $t \in T$, so all
we need to do in order to show that $(J, f)$ is an upper bound for the chain is establish
that $(J, f) \in P$. Clearly, $J \subseteq I$, and $f : J \to S$ is injective (since we saw that in
fact for any two $x_1, x_2 \in J$ the function $f$ agrees with some $f_t$, which is injective, so
$f(x_1) = f(x_2) \implies f_t(x_1) = f_t(x_2) \implies x_1 = x_2$). So it remains to observe that
$(I - J) \cup f(J)$ is a linearly independent set of vectors. Indeed, if 0 can be obtained
as a non-trivial linear combination of the vectors $x_1, \dots, x_m \in (I - J) \cup f(J)$, then,
using the chain condition again, it follows that there exists an element $t \in T$ such
that $x_1, \dots, x_m \in (I - J_t) \cup f_t(J_t)$, which is a linearly independent set, yielding a
contradiction.

With the conditions of Zorn’s Lemma now verified, it follows that there exists
a maximal element $(J_M, f_M)$, with $J_M \subseteq I$ and $f_M : J_M \to S$ an injection. If
$J_M = I$, then we are done, since $f_M : I \to S$ is the required injection. Assuming
this is not the case, let $x_1 \in I - J_M$. If we can find a vector $y_1 \in S - f_M(J_M)$ such that

\[ (I - (J_M \cup \{x_1\})) \cup (f_M(J_M) \cup \{y_1\}) \]

is linearly independent, then we will have that $(J_M \cup \{x_1\}, f) \in P$, where $f$ is the
extension of $f_M$ given by $f(x_1) = y_1$. But that would contradict the maximality of
$(J_M, f_M)$, and we will have our contradiction. So, we proceed to prove the existence
of such a vector $y_1$. If no such $y_1$ exists, then that means that the set

\[ (I - (J_M \cup \{x_1\})) \cup (f_M(J_M) \cup \{y\}) \]

is a linearly dependent set for every $y \in S - f_M(J_M)$. Now, since $S$ is a spanning
set we may write

\[ x_1 = \sum_{s \in S} a_s \cdot s = \sum_{s \in f_M(J_M)} a_s \cdot s + \sum_{s \notin f_M(J_M)} a_s \cdot s = x^{(1)} + x^{(2)} \]

where the sum is a finite sum, so that $a_s = 0$ for all but finitely many $s$, and we simply
split the sum according to whether or not $s \in f_M(J_M)$, denoting the two partial sums
by $x^{(1)}$ and $x^{(2)}$. By our assumption, the set

\[ (I - (J_M \cup \{x_1\})) \cup f_M(J_M) \]

is linearly dependent if any of the vectors $s \in S - f_M(J_M)$ is added to it. Thus,
this set is linearly dependent if any linear combination of such vectors, such as $x^{(2)}$,
is added to it. Further, since $x^{(1)}$ is in the span of $f_M(J_M)$ it follows that the set above
is linearly dependent if $x_1 = x^{(1)} + x^{(2)}$ is added to it. But the latter is the set

\[ (I - (J_M \cup \{x_1\})) \cup f_M(J_M) \cup \{x_1\} = (I - J_M) \cup f_M(J_M) \]

which is linearly independent. The proof is now complete. □

We are now in position to establish the following important result. 

Theorem 2.2 Let $V$ be a linear space. If $B_1$ and $B_2$ are any two bases for $V$, then
they have the same cardinality.

Proof By definition of basis, $B_1$ is linearly independent and $B_2$ is a spanning set.
By Lemma 2.1, there is an injective function $f : B_1 \to B_2$. By the same argument
there is also an injective function $g : B_2 \to B_1$. It now follows from the
Cantor-Schröder-Bernstein theorem (Theorem 1.1 in the Preliminaries) that the cardinalities
of $B_1$ and $B_2$ are equal. □

Definition 2.3 The dimension of a linear space V, indicated by dim(V), is equal to 
the cardinality of a basis for it. Notice that by Theorem 2.1, at least one basis exists, 
and by Theorem 2.2 all bases have the same cardinality. Thus, the notion of dimension 
is well-defined. The linear space V is said to be finite dimensional if its dimension is 
finite, and infinite dimensional otherwise. A basis for an infinite dimensional linear 
space is also referred to as a Hamel basis. 

Example 2.15 Finite dimensional linear spaces, by Examples 2.11 and 2.14, include
the spaces $\mathbb{R}^n$ and $\mathbb{C}^n$, having dimension $n$, and the space $P_n$ of polynomials (see
Example 2.6), which is of dimension $n + 1$. Infinite dimensional linear spaces include,
due to Examples 2.12 and 2.14, the space $c_{00}$ of sequences that are eventually 0, the
space $P$ of all polynomials with real coefficients, and the space $C(I, \mathbb{R})$ of continuous
functions $x : I \to \mathbb{R}$, as long as $I$ does not reduce to a point.

We have already encountered linear spaces of very large dimension. To argue
about the dimension of such spaces (and in general) the following result, interesting
on its own, is useful. Recall from Example 2.5 the linear space $(K^B)_0$ of all functions
$x : B \to K$ satisfying $x(b) = 0$ for all but finitely many $b \in B$.

Proposition 2.3 Let $V$ be a linear space and $B$ a basis for it. Then every vector
$x \in V$ can be expressed uniquely as a linear combination of elements from $B$. In
other words, there is a bijective correspondence between the vectors $x \in V$ and the
elements of $(K^B)_0$.

Proof Since a typical element in $(K^B)_0$ is nothing but a function $x : B \to K$ for
which $x(b) = 0$ for all but finitely many $b \in B$, we may associate with each such $x$ the
vector $\sum_{b \in B} x(b) \cdot b$ (as the summation is finite). We claim that this correspondence
is the desired bijection. Indeed, it is a tautology that the surjectivity of this process
is precisely the claim that $B$ is a spanning set, while it is almost a tautology that the
injectivity of the process is the claim that $B$ is linearly independent. □
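
For a finite basis, the unique coefficients promised by the proposition can be computed by solving a linear system. The following sketch is an illustration under assumptions not made in the text: the space is $\mathbb{Q}^n$ with exact `Fraction` arithmetic, the basis is given as a list of $n$ tuples, and the function name `coordinates` is hypothetical. It uses Gauss-Jordan elimination on the matrix whose columns are the basis vectors.

```python
from fractions import Fraction

def coordinates(basis, v):
    """Return the unique scalars a_1, ..., a_n with a_1*b_1 + ... + a_n*b_n = v.

    Assumes `basis` is a list of n vectors with n components each; a
    ValueError is raised if elimination shows they do not form a basis.
    """
    n = len(basis)
    # Augmented matrix: column j holds the j-th basis vector, last column holds v.
    m = [[Fraction(basis[j][i]) for j in range(n)] + [Fraction(v[i])]
         for i in range(n)]
    for col in range(n):
        pivot = next((i for i in range(col, n) if m[i][col] != 0), None)
        if pivot is None:
            raise ValueError("the given vectors do not form a basis")
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [c / p for c in m[col]]          # scale the pivot row to 1
        for i in range(n):
            if i != col:
                factor = m[i][col]
                m[i] = [ci - factor * cc for ci, cc in zip(m[i], m[col])]
    return [m[i][n] for i in range(n)]

basis = [(1, 1, 0), (1, 0, 0), (0, 0, 1)]
v = (2, 5, -3)
coords = coordinates(basis, v)

# Rebuild v from its coordinates to confirm the expansion.
rebuilt = tuple(sum(a * b[i] for a, b in zip(coords, basis)) for i in range(3))
```

Here the coordinates play the role of the function $x : B \to K$ in the proposition: they depend on the chosen basis, and a different basis assigns different coordinates to the same vector.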

We thus see that a basis B endows a linear space with a notion of coordinates. 
This is certainly a useful thing to have, but for practical reasons it is only as useful as 
the ability to explicitly describe the basis B. In finite dimensional linear spaces it is 
very common to work with bases but, as we remarked earlier, in infinite dimensional 
linear spaces being able to explicitly describe a basis is the exception rather than the 
rule. Consequently, Hamel bases in infinite dimensional linear spaces are generally 
used for theoretical rather than practical purposes. 

Example 2.16 Recall from Example 2.8 that $\mathbb{R}$ may be viewed as a linear space over
$\mathbb{Q}$. Suppose that $\mathcal{B}$ is a Hamel basis for $\mathbb{R}$ over $\mathbb{Q}$. By Proposition 2.3, there is then
a bijection between $\mathbb{R}$ and the set $(\mathbb{Q}^{\mathcal{B}})_0$ of all functions $\mathcal{B} \to \mathbb{Q}$ which attain 0
at all but finitely many arguments. In other words, $|\mathbb{R}| = |(\mathbb{Q}^{\mathcal{B}})_0|$ (see Sect. 1.2.11
of the Preliminaries for the basics of cardinalities). If $\mathcal{B}$ is countable, then it is
easy to write $(\mathbb{Q}^{\mathcal{B}})_0$ as a countable union of countable sets (refer to Sect. 1.2.13
from the Preliminaries for the relevant material) and thus the set $(\mathbb{Q}^{\mathcal{B}})_0$ itself would
be countable. But that would imply that $\mathbb{R}$ is a countable set while the reals are
well-known to be uncountable. We conclude that $\mathbb{R}$, as a linear space over $\mathbb{Q}$, is
infinite dimensional of uncountable dimension.

We close this section by illustrating a difference between finite dimensional linear 
spaces and infinite dimensional ones. 

Proposition 2.4 The cardinality of any linearly independent set $A$ in a linear space
$V$ is a lower bound for the dimension of $V$. Moreover, if $V$ is finite dimensional, then
any linearly independent set $A \subseteq V$ whose cardinality is equal to the dimension of
$V$ is a basis.

Proof The dimension of $V$ is the cardinality of any basis $B$, and as $B$ is in particular
a spanning set, it follows from Lemma 2.1 that there is an injection $A \to B$, and
so the cardinalities satisfy $|A| \le |B|$ (refer to Sect. 1.2.11 of the Preliminaries, if
needed). The claim now follows since the latter is the dimension of $V$.

Now, if $V$ is finite dimensional, say of dimension $n$, and $A = \{x_1, \dots, x_n\}$ is
a set of $n$ linearly independent vectors, then to show $A$ is a basis we just need to
prove that it is a spanning set. But if it were not, and $y \in V$ is any vector not in its
span, then the set $\{x_1, \dots, x_n, y\}$ is linearly independent and contains $n + 1$ vectors.
But then, by the first part of the proposition, it would follow that $n + 1 \le n$, an
absurdity. □

Remark 2.6 The finite dimensionality assumption is crucial. For instance, in the
linear space $c_{00}$ of sequences which are eventually 0 (Example 2.4), consider the
vectors $\{x_k\}_{k \ge 1}$ where $x_k = (0, \dots, 0, 1, 0, \dots)$, with 1 in the $k$-th
position. These vectors are easily seen to be linearly independent and spanning, thus they
form a basis of countably many vectors. The dimension of $c_{00}$ is thus countably infinite.
The set $\{x_2, x_3, \dots\}$ is clearly also linearly independent and has the same cardinality
as the dimension of $c_{00}$, but it is not spanning, and thus not a basis.

Exercises 

Exercise 2.11 Let $V$ be a linear space and $S, S' \subseteq V$ two arbitrary sets of vectors
with $S \subseteq S'$. Prove that if $S'$ is linearly independent, then so is $S$, and prove
that if $S$ is a spanning set, then so is $S'$.

Exercise 2.12 Let $V$ be a finite dimensional linear space and $S$ a spanning set. Prove
that $S$ can be sifted to give a basis, that is, show that there exists a subset
$S' \subseteq S$ such that $S'$ is a basis for $V$.

Exercise 2.13 Let $V$ be a linear space (not necessarily finite dimensional) and $S$ a
spanning set. Prove that $S$ can be sifted to give a basis, that is, show that there exists
a subset $S' \subseteq S$ such that $S'$ is a basis for $V$.

Exercise 2.14 Consider $\mathbb{R}$ as a linear space over $\mathbb{Q}$ and let $a$ be a
transcendental number. Prove that the set $\{1, a, a^2, a^3, \dots\}$ is linearly independent.
Prove that it is not a spanning set by showing that its span is a countable set.

Exercise 2.15 Consider $\mathbb{R}$ as a linear space over $\mathbb{Q}$. Prove that the
dimension of $\mathbb{R}$ over $\mathbb{Q}$ is $|\mathbb{R}|$, the cardinality of the real
numbers.

Exercise 2.16 Let $V$ be a linear space and $S \subseteq V$ a set of vectors. For a scalar
$\alpha \in K$, let us write

$$\alpha S = \{\alpha x \mid x \in S\}.$$

Assuming that $\alpha \ne 0$ is fixed, prove that $S$ is linearly independent (respectively
spanning, a basis) if, and only if, $\alpha S$ is linearly independent (respectively spanning,
a basis).

Exercise 2.17 Let $V$ be a linear space and $\{x_k\}_{k \in \mathbb{N}}$ countably many
vectors in $V$. For all $m \in \mathbb{N}$, let

$$y_m = \sum_{k=1}^{m} x_k.$$

Prove that $\{x_k\}_{k \in \mathbb{N}}$ is linearly independent (respectively spanning, a
basis) if, and only if, $\{y_k\}_{k \in \mathbb{N}}$ is linearly independent (respectively
spanning, a basis).

Exercise 2.18 Let $B$ be an arbitrary set and $K$ a field. Consider the linear spaces
$K^B$ and $(K^B)_0$. For every $b_0 \in B$ let $x_{b_0} : B \to K$ be the function

$$x_{b_0}(b) = \begin{cases} 1 & \text{if } b = b_0, \\ 0 & \text{otherwise.} \end{cases}$$

Prove that the set $\{x_{b_0}\}_{b_0 \in B}$ is linearly independent in $K^B$. Is it a basis
for $K^B$? Is it a basis for $(K^B)_0$?

Exercise 2.19 In the linear space $C(\mathbb{R}, \mathbb{R})$, find three vectors
$x, y, z : \mathbb{R} \to \mathbb{R}$ such that $\{x, y, z\}$ is a linearly independent set
while $\{x^2, y^2, z^2\}$ is linearly dependent. (Here $x^2$ refers to the function given by
$x^2(t) = (x(t))^2$, and similarly for the other functions.)

Exercise 2.20 Consider the space $c_0$ of sequences of complex (or real, if you like)
numbers that converge to 0. How many linearly independent vectors can you find in
$c_0$?


2.3 Linear Operators

The definition and properties of the structure preserving functions between linear 
spaces form the topic of this section. The definitions (including that of a linear 
isomorphism) and the results are discussed and exemplified in the context of the 
linear spaces introduced above. 

Definition 2.4 Let $V$ and $W$ be linear spaces over the same field $K$. A function
$T : V \to W$ is said to be additive if

$$T(x + y) = T(x) + T(y)$$

for all vectors $x, y \in V$, and it is said to be homogeneous if

$$T(\alpha x) = \alpha T(x)$$

for all vectors $x \in V$ and all scalars $\alpha \in K$. If $T$ is both additive and
homogeneous, then it is called a linear operator, a condition equivalent to the equality

$$T(\alpha_1 x_1 + \alpha_2 x_2) = \alpha_1 T(x_1) + \alpha_2 T(x_2)$$

for all $x_1, x_2 \in V$ and $\alpha_1, \alpha_2 \in K$.

Remark 2.7 Synonymous terms for linear operator are linear transformation and
linear homomorphism. Throughout the book we will adopt the convention that any
reference to a linear operator $T : V \to W$ implies, at times implicitly,
that $V$ and $W$ are linear spaces over the same field $K$.

It is a straightforward proof by induction that a linear operator $T$ preserves any
linear combination, that is

$$T\Big(\sum_{k=1}^{m} \alpha_k x_k\Big) = \sum_{k=1}^{m} \alpha_k T(x_k)$$

for all scalars $\alpha_1, \dots, \alpha_m \in K$ and vectors $x_1, \dots, x_m \in V$. In fact,
this property can equivalently be taken as the definition of linear operator. The next result
shows that linearity forces any linear operator to also respect the zero vector, additive
inverses, and subtraction.

Proposition 2.5 For a linear operator $T : V \to W$:

1. $T(0) = 0$.

2. $T(-x) = -T(x)$ for all $x \in V$.

3. $T(x - y) = T(x) - T(y)$ for all $x, y \in V$.

Proof

1. Notice that $T(0) + T(0) = T(0 + 0) = T(0)$, and now subtract $T(0)$ from both
sides of the equation.

2. $T(x) + T(-x) = T(x + (-x)) = T(0) = 0$.

3. $T(x - y) = T(x + (-y)) = T(x) + T(-y) = T(x) - T(y)$. □

2.3.1 Examples of Linear Operators 

Trivial, but important, examples of linear operators are the following. For any linear
space $V$, the identity function $\mathrm{id} : V \to V$ is always a linear operator. On the
other extreme, given linear spaces $V$ and $W$ over the same field, the function
$T_0 : V \to W$ given by $T_0(x) = 0$ is also immediately seen to be a linear operator.

In the presence of a basis for the domain $V$, linear operators $T : V \to W$ are
easily characterized by the following result, readily yielding an endless supply of
examples.

Lemma 2.2 Let $V$ and $W$ be linear spaces over the same field $K$, and let $B$ be a
basis for $V$. It then holds that any function $F : B \to W$ extends uniquely to a linear
operator $T_F : V \to W$.

Proof For a given function $F : B \to W$, suppose that $T_F : V \to W$ is a linear
operator extending $F$. Then, given any vector $x \in V$, write

$$x = \sum_{b \in B} \alpha_b \cdot b$$

as a (finite) linear combination of basis elements, and then it follows that

$$T_F(x) = T_F\Big(\sum_{b \in B} \alpha_b \cdot b\Big) = \sum_{b \in B} \alpha_b \cdot T_F(b) = \sum_{b \in B} \alpha_b \cdot F(b).$$

Noticing that the computation above expresses $T_F(x)$ in terms of vectors of the form
$F(b)$, we conclude that an extension $T_F$, if it exists, is unique. If we now take the
equality above as a definition, then we obtain a function $T_F : V \to W$ (relying on
the uniqueness of the linear combination expressing $x$ in order to assure that $T_F$ is
well-defined). Verifying that this $T_F$ is indeed a linear operator and that it extends $F$
is immediate. □

Remark 2.8 The linear operator $T_F : V \to W$ constructed from the given function
$F : B \to W$ above is said to be obtained by linearly extending $F$ to all of $V$.
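
Linear extension can be sketched in code. The following is only an illustration under
stated assumptions: $V = \mathbb{Q}^2$ with its standard basis, $W = \mathbb{Q}^3$, and
hypothetical names `extend_linearly`, `coordinates`, and `F` that are not part of the text.

```python
from fractions import Fraction

def extend_linearly(F, coordinates, zero):
    """Return T_F with T_F(x) = sum_b alpha_b * F(b), where alpha = coordinates(x)."""
    def T_F(x):
        result = zero
        for b, alpha in coordinates(x).items():
            result = tuple(r + alpha * w for r, w in zip(result, F[b]))
        return result
    return T_F

# Assumed setting: V = Q^2 with its standard basis, W = Q^3.
e1, e2 = (1, 0), (0, 1)
F = {e1: (1, 2, 3), e2: (0, 1, 0)}   # arbitrary values prescribed on the basis

def coordinates(x):
    """Coordinates of x in the standard basis of Q^2."""
    return {e1: Fraction(x[0]), e2: Fraction(x[1])}

T = extend_linearly(F, coordinates, (Fraction(0),) * 3)
```

The function `T` agrees with `F` on the basis vectors, and its value on any other vector is
forced by the coordinates of that vector, mirroring the uniqueness part of the proof.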

Often enough the technique of the last result is inadequate, either because a basis 
is not readily available or because a more direct formula for the linear operator is 
obtainable. The following examples illustrate these possibilities. 

Example 2.17 The differentiation operation $d/dt$, taking a function $f$ to its derivative
$df/dt$, satisfies the well-known properties

$$\frac{d}{dt}(f + g) = \frac{d}{dt}(f) + \frac{d}{dt}(g)$$

and

$$\frac{d}{dt}(\alpha f) = \alpha \frac{d}{dt}(f),$$

namely, it is additive and homogeneous, and thus, if we choose the domain and
codomain correctly, we expect it to be a linear operator. To turn this observation
into a precise statement, recall the linear spaces $C^k([a, b], \mathbb{R})$ from Example 2.7
of all functions $x : [a, b] \to \mathbb{R}$ with a continuous $k$-th derivative. Then, for
every $k \ge 1$, the observation above can be stated by saying that
$d/dt : C^k([a, b], \mathbb{R}) \to C^{k-1}([a, b], \mathbb{R})$ is a linear operator.
Similarly, $d/dt : C^\infty([a, b], \mathbb{R}) \to C^\infty([a, b], \mathbb{R})$ is a linear
operator on the linear space of infinitely differentiable functions.
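
Additivity and homogeneity of differentiation can be checked by hand on polynomials, which
are infinitely differentiable. The sketch below assumes a polynomial is stored as its list of
coefficients `[a0, a1, ..., an]`; the helper names are hypothetical.

```python
def derivative(p):
    """Differentiate the polynomial with coefficient list [a0, a1, ..., an]."""
    return [k * a for k, a in enumerate(p)][1:]

def add(p, q):
    """Coefficientwise sum, padding the shorter list with zeros."""
    n = max(len(p), len(q))
    p, q = p + [0] * (n - len(p)), q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def scale(alpha, p):
    """Multiply every coefficient by the scalar alpha."""
    return [alpha * a for a in p]
```

With `f = [1, 2, 3]` and `g = [0, 0, 0, 4]`, differentiating `add(f, g)` gives the same list as
adding the two derivatives, and differentiating `scale(5, f)` gives `scale(5, derivative(f))`.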

Example 2.18 The integral operator $\int_a^b dt\, f(t)$ is also well known to be additive and
homogeneous, and, since every continuous function on a closed interval is integrable,
we obtain that for every interval $I = [a, b]$ the function

$$f \mapsto \int_a^b dt\, f(t)$$

is a linear operator from the linear space $C(I, \mathbb{R})$ of Example 2.7 to $\mathbb{R}$
as a linear space over itself.
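
As a quick sanity check, linearity of the integral can be verified exactly on polynomials.
This is a minimal sketch, assuming polynomials are given by coefficient lists
`[a0, a1, ...]`; the function name `integrate` is hypothetical.

```python
from fractions import Fraction

def integrate(p, a, b):
    """Exact integral over [a, b] of the polynomial with coefficients [a0, a1, ...]."""
    return sum(
        Fraction(c) * (Fraction(b) ** (k + 1) - Fraction(a) ** (k + 1)) / (k + 1)
        for k, c in enumerate(p)
    )
```

For instance, with $f(t) = t$ and $g(t) = 1 + 3t^2$ on $[0, 2]$, the integral of $2f + 3g$
equals twice the integral of $f$ plus three times the integral of $g$.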

Example 2.19 Spaces of infinite sequences, such as $\mathbb{R}^\infty$ from Example 2.3 or the
space $c_0$ from Example 2.4, admit the following operators (we use $\mathbb{R}^\infty$ just to
fix one possibility). The shift operator $S : \mathbb{R}^\infty \to \mathbb{R}^\infty$, given,
for $x = (x_1, x_2, \dots, x_k, \dots)$, by $S(x) = (x_2, x_3, \dots, x_k, \dots)$, is easily
seen to be a linear operator. It is equally easy to see that the function
$T : \mathbb{R}^\infty \to \mathbb{R}^\infty$, given for $x = (x_1, x_2, \dots, x_k, \dots)$
as above by $T(x) = (0, x_1, x_2, \dots, x_k, \dots)$, is a linear operator. Note that
$S \circ T$ is the identity while $T \circ S$ is not, a phenomenon that is known to be
impossible in finite dimensional linear spaces. These operators are the creation and
annihilation operators $a^*$ and $a$, widely used in Quantum Mechanics.
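
The asymmetry between $S \circ T$ and $T \circ S$ is easy to observe directly. In the sketch
below a sequence is modelled, as an assumption made for illustration, by a Python function
`k -> x_k` defined for `k >= 1`; the helper `prefix` is hypothetical.

```python
def S(x):
    """Left shift: (x1, x2, x3, ...) -> (x2, x3, ...)."""
    return lambda k: x(k + 1)

def T(x):
    """Right shift: (x1, x2, ...) -> (0, x1, x2, ...)."""
    return lambda k: 0 if k == 1 else x(k - 1)

def prefix(x, n):
    """First n terms of the sequence x."""
    return [x(k) for k in range(1, n + 1)]
```

Applying `T` and then `S` returns every sequence unchanged, whereas applying `S` and then `T`
replaces the first term by 0, so information is lost.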

We see thus that even when a basis can be given explicitly (and certainly when it 
cannot) it may be much simpler to define a linear operator directly rather than by 
linear extension on a basis. 


2.3.2 Algebra of Operators 

With suitable domain and codomain, linear operators can be composed or added. We
now investigate the resulting algebraic laws related to these operations.

Proposition 2.6 The composition $S \circ T : U \to W$ of any two linear operators
$U \xrightarrow{T} V \xrightarrow{S} W$ (where in particular all linear spaces are over the
same field) is a linear operator.

Proof The additivity of $S \circ T$ follows from the definition of the composition and the
computation

$$S(T(x + y)) = S(T(x) + T(y)) = S(T(x)) + S(T(y)),$$

valid for all vectors $x, y \in U$, utilizing the additivity of $T$ and $S$. The homogeneity
of the composition follows similarly and the reader is invited to fill in the details. □

Next, any two functions $T, S : V \to W$ between linear spaces over the field $K$ can
be added to give rise to the function $T + S : V \to W$ given by

$$(T + S)(x) = T(x) + S(x)$$

for all $x \in V$. Further, given a scalar $\alpha \in K$, the function $\alpha T : V \to W$
is given by

$$(\alpha T)(x) = \alpha (T(x)).$$

Proposition 2.7 For all linear operators $T, S : V \to W$ and scalars $\alpha \in K$, both
$T + S$ and $\alpha T$ are linear operators.

Proof For all vectors $x, y \in V$ and scalars $\beta, \gamma \in K$:

$$\begin{aligned}
(T + S)(\beta x + \gamma y) &= T(\beta x + \gamma y) + S(\beta x + \gamma y) \\
&= T(\beta x) + T(\gamma y) + S(\beta x) + S(\gamma y) \\
&= \beta T(x) + \gamma T(y) + \beta S(x) + \gamma S(y) \\
&= \beta (T(x) + S(x)) + \gamma (T(y) + S(y)) \\
&= \beta (T + S)(x) + \gamma (T + S)(y).
\end{aligned}$$

The proof for $\alpha T$ is similar and left for the reader. □

The result above shows that the set of all linear operators $T : V \to W$ has naturally
defined notions of addition and scalar multiplication. It is a pleasant fact that with
these operations one obtains a linear space.

Definition 2.5 Given linear spaces $V$ and $W$ over the same field $K$, we denote by
$\mathrm{Hom}(V, W)$ the set of all linear operators $T : V \to W$.

Theorem 2.3 For linear spaces $V$ and $W$ over the same field $K$, the set
$\mathrm{Hom}(V, W)$, when endowed with the operations of addition and scalar multiplication
as above, is a linear space over $K$.

Proof The proof is a straightforward verification of the linear space axioms, so we
only give the details for a few of the axioms. For instance, given linear operators
$T_1, T_2 : V \to W$, to show that

$$T_1 + T_2 = T_2 + T_1$$

we note that

$$(T_1 + T_2)(x) = T_1(x) + T_2(x) = T_2(x) + T_1(x) = (T_2 + T_1)(x)$$

where the commutativity of vector addition in $W$ was used.

To show the existence of an additively neutral element, recall that $T_0 : V \to W$
given by $T_0(x) = 0$ is always a linear operator, and thus $T_0 \in \mathrm{Hom}(V, W)$.
Seeing that $T_0$ is additively neutral is simply the computation

$$(T + T_0)(x) = T(x) + T_0(x) = T(x) + 0 = T(x).$$

We leave the verification of the other axioms to the reader. □
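
The pointwise operations on operators translate directly into code. The following is a
minimal sketch, assuming the simplest possible setting of linear operators on the real line
viewed as a linear space over itself; all names are hypothetical, and floating point numbers
stand in for real scalars.

```python
def add(T, S):
    """Pointwise sum: (T + S)(x) = T(x) + S(x)."""
    return lambda x: T(x) + S(x)

def scale(alpha, T):
    """Scalar multiple: (alpha T)(x) = alpha * T(x)."""
    return lambda x: alpha * T(x)

def compose(S, T):
    """Composition: (S o T)(x) = S(T(x))."""
    return lambda x: S(T(x))

zero = lambda x: 0          # the operator T_0
double = lambda x: 2 * x    # two sample linear operators
negate = lambda x: -x
```

Evaluating `add(double, negate)` and `add(negate, double)` at any point gives the same value,
and `add(double, zero)` behaves exactly like `double`, in line with the two axioms verified in
the proof.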


2.3.3 Isomorphism 

The linear spaces $\mathbb{R}^{n+1}$ and $P_n$, while consisting of very different elements,
are, in a strong sense, essentially identical. It is obvious that one may think of a sequence
of $n + 1$ real numbers as the coefficients of a polynomial, and, vice versa, one may
identify a polynomial function with its list of coefficients and thus obtain $n + 1$ real
numbers. Thus, one may rename the elements in one space to obtain the elements of
the other, and, moreover, the linear structure under this renaming is respected. This
situation is made precise, and generalized, by the concept of isomorphism.

Definition 2.6 A linear operator $T : V \to W$ which, as a function, is a bijection
is called a linear isomorphism (or simply an isomorphism if the linear context is
clear). When $T : V \to W$ is an isomorphism, the spaces $V$ and $W$ are said to be
isomorphic, denoted by $V \cong W$.

The following result establishes some expected behaviour of isomorphisms. 

Proposition 2.8 Suppose that $U$, $V$, and $W$ are linear spaces over the same field $K$
and consider the linear operators $T : U \to V$ and $S : V \to W$. The following then
hold.

1. The identity function $\mathrm{id} : V \to V$ is an isomorphism.

2. If $T$ and $S$ are isomorphisms, then so is $S \circ T$.

3. If $T$ is an isomorphism, then so is the inverse function $T^{-1}$.

Proof

1. We already noted that the identity function is always a linear operator. It is clearly
bijective and thus an isomorphism.

2. We already noted that the composition of linear operators is a linear operator, and
thus $S \circ T$ is a linear operator. In general, the composition of bijective functions
is again bijective, and so if $T$ and $S$ are isomorphisms, then so is $S \circ T$.

3. Since $T : U \to V$ is an isomorphism, the inverse function $T^{-1} : V \to U$ exists,
and is clearly a bijection. To conclude the proof it remains to be shown that $T^{-1}$
is a linear operator. Indeed, $T^{-1}$ is additive since

$$\begin{aligned}
T^{-1}(x + y) &= T^{-1}(T(T^{-1}(x)) + T(T^{-1}(y))) \\
&= T^{-1}(T(T^{-1}(x) + T^{-1}(y))) \\
&= T^{-1}(x) + T^{-1}(y)
\end{aligned}$$

for all vectors $x, y \in V$. Further, $T^{-1}$ is homogeneous since

$$T^{-1}(\alpha x) = T^{-1}(\alpha T(T^{-1}(x))) = T^{-1}(T(\alpha T^{-1}(x))) = \alpha T^{-1}(x)$$

for all vectors $x \in V$ and scalars $\alpha \in K$. □

Corollary 2.1 For all linear spaces $U$, $V$, and $W$ over the same field $K$:

1. $V \cong V$.

2. If $U \cong V$, then $V \cong U$.

3. If $U \cong V$ and $V \cong W$, then $U \cong W$.

Isomorphic linear spaces are essentially identical, except (possibly) for the names 
of the elements in them and they thus possess exactly the same linear properties. 
This is a somewhat vague statement but it is almost always immediate how to turn 
it into a precise statement. For instance, the dimension of a linear space is a linear 
property and thus any two isomorphic linear spaces have the same dimension, as we 
now show. 

Proposition 2.9 If $V \cong W$, then both linear spaces have the same dimension.

Proof By assumption there exists a linear isomorphism $T : V \to W$. The dimension
of a linear space is the cardinality of any of its bases, and thus the result will be
established by exhibiting a bijection between a basis for $V$ and a basis of $W$. Let
$B \subseteq V$ be a basis and consider its image $T(B) \subseteq W$. Clearly, $T|_B$, the
restriction of $T$ to $B$, is a bijection between $B$ and $T(B)$, so all that remains to be
done is to show that $T(B)$ is a basis for $W$, namely that $T(B)$ is a spanning set of
linearly independent vectors. Let $y \in W$ be an arbitrary vector. Since $T$ is onto, we may
write $y = T(x)$ for some $x \in V$. As $B$ is a basis, we express $x$ as a (finite!) linear
combination of basis elements:

$$x = \sum_{b \in B} \alpha_b \cdot b.$$

But then

$$y = T(x) = \sum_{b \in B} \alpha_b \cdot T(b)$$

and thus $y$ is in the span of $T(B)$. As $y$ was arbitrary we conclude that $T(B)$ spans
all of $W$. The proof that $T(B)$ is also linearly independent follows a similar pattern
and is left for the reader. □
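
For readers who want a hint, one possible way to organize the linear independence argument
is sketched below. It is only an outline of a standard argument, writing the scalars as
$\alpha_b$, and it uses nothing beyond the injectivity of $T$ and Proposition 2.5.

```latex
% Sketch: T(B) is linearly independent when T is injective and B is a basis.
\begin{align*}
\sum_{b \in B} \alpha_b \cdot T(b) = 0 \quad (\text{finitely many } \alpha_b \neq 0)
  &\implies T\Big(\sum_{b \in B} \alpha_b \cdot b\Big) = 0 = T(0) \\
  &\implies \sum_{b \in B} \alpha_b \cdot b = 0 \quad (T \text{ is injective}) \\
  &\implies \alpha_b = 0 \text{ for all } b \in B \quad (B \text{ is linearly independent}).
\end{align*}
```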

The converse to this result is also true, as the following result implies. 

Theorem 2.4 If $V$ is a linear space and $B$ is a basis for it, then $V \cong (K^B)_0$.

Proof In Proposition 2.3 we established that the function $T : (K^B)_0 \to V$ given by

$$T(f) = \sum_{b \in B} f(b) \cdot b$$

is a bijection. We show now that it is in fact an isomorphism, thus completing the
proof. We need to verify that $T(\alpha_1 f + \alpha_2 g) = \alpha_1 T(f) + \alpha_2 T(g)$,
which amounts to showing that

$$\sum_{b \in B} (\alpha_1 f + \alpha_2 g)(b) \cdot b = \alpha_1 \sum_{b \in B} f(b) \cdot b + \alpha_2 \sum_{b \in B} g(b) \cdot b,$$

which follows by an immediate computation. □

Remark 2.9 Every linear space $V$ over $K$ is thus isomorphic to a linear space of the
form $(K^B)_0$. The latter space is clearly only dependent on the cardinality of the set
$B$, i.e., if $X$ is any set with the same cardinality as $B$, then $(K^B)_0 \cong (K^X)_0$.
In other words, the dimension of a linear space determines it up to an isomorphism. It should
be noted however that generally there is no canonical choice for an isomorphism
between $V$ and $(K^X)_0$.

In the finite dimensional case we obtain the following corollary. 

Corollary 2.2 Let $V$ be a finite dimensional linear space over the field $K$. Any
choice of a basis $x_1, \dots, x_n$ for $V$ gives rise to an isomorphism $V \to K^n$. In this
way we may identify every vector in $V$, by means of coordinates, with an $n$-tuple of
scalars.

In a more general context, and in particular in the infinite dimensional case, where 
bases are typically not explicitly available, we have the following results. 

Corollary 2.3 If $V$ and $W$ are linear spaces over the same field $K$ and they have
the same dimension, then they are isomorphic.

Proof By Remark 2.9, if we let $X$ be a set of cardinality equal to the common
dimension of $V$ and $W$, then we obtain an isomorphism $T_1 : (K^X)_0 \to V$ and
an isomorphism $T_2 : (K^X)_0 \to W$. It follows that $T_2 \circ T_1^{-1} : V \to W$ is an
isomorphism, showing that $V \cong W$, as claimed. □

Combining Proposition 2.9 and Corollary 2.3, the discussion above is summarized 
as follows. 

Theorem 2.5 Two linear spaces over the same field are isomorphic if, and only if,
they have the same dimension.

Remark 2.10 This result is an example of what is known as a rigidity phenomenon. 
Rigidity refers to a situation where two structures are essentially the same given 
that they have essentially the same substructure of some kind, typically of a much 
coarser nature than the original structures. In this case, the rigidity of linear spaces 
over a fixed field K is that the dimension, i.e., the cardinality of a basis, suffices to 
determine the linear space, up to an isomorphism. 

Example 2.20 Recall the linear space $P_n$ of polynomial functions from Example
2.6. We can easily show that $P_n \cong \mathbb{R}^{n+1}$ by constructing an isomorphism
between the two spaces. Indeed, referring here to a typical element in $\mathbb{R}^{n+1}$ by
$(a_0, \dots, a_n)$, let

$$T : \mathbb{R}^{n+1} \to P_n$$

be given by

$$T(a_0, \dots, a_n) = a_n t^n + \cdots + a_1 t + a_0.$$

It is a trivial matter to verify that $T$ is a linear operator, clearly bijective, and thus an
isomorphism. Similarly, the reader is invited to show that $c_{00} \cong P$. By the discussion
above, the existence of an isomorphism (but not any explicit isomorphism) could
have been deduced by noting that both $\mathbb{R}^{n+1}$ and $P_n$, as linear spaces over
$\mathbb{R}$, have dimension $n + 1$. Similarly, both $c_{00}$ and $P$ have countably infinite
dimension and are thus isomorphic.
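
The map $T$ of this example can be written down in a few lines. The sketch below assumes
that a tuple `(a0, ..., an)` stands for an element of $\mathbb{R}^{n+1}$ and that a polynomial
function is represented by a Python callable; the inner name `poly` is hypothetical.

```python
def T(coeffs):
    """Send (a0, ..., an) to the polynomial function t -> an*t^n + ... + a1*t + a0."""
    def poly(t):
        value = 0
        for a in reversed(coeffs):   # Horner evaluation, highest coefficient first
            value = value * t + a
        return value
    return poly
```

Linearity then amounts to the observation that, at every point `t`, the value of
`T` applied to a linear combination of tuples equals the same linear combination of the
values of the individual polynomials.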

Remark 2.11 Theorem 2.5 tells us that the dimension of a linear space essentially
is all that one needs to know about a linear space in order to study it (since after
all, isomorphic linear spaces are essentially the same). It is thus tempting to, once
and for all, choose a single representative from each isomorphism class of linear
spaces. This certainly suffices for the study of all linear spaces, and seems more
economical than having all of this redundancy in the form of multiple isomorphic
specimens of linear spaces. However, the example above clearly shows the danger
in this approach. While $\mathbb{R}^{n+1}$ and $P_n$ are isomorphic, they have different
qualities (at least to us humans). For instance, it is very natural to consider the integral as
an operator on polynomials, but not so much on elements in $\mathbb{R}^{n+1}$. The richness of
having many different linear spaces, even if they are isomorphic, is a blessing that
should not be given up for the sake of a more economical treatment.

Exercises

Exercise 2.21 Consider the linear space $c$ of all convergent sequences of complex
numbers (or real numbers, with the obvious adjustments). Prove that the assignment

$$\{x_m\}_{m \ge 1} \mapsto \lim_{m \to \infty} x_m$$

is a linear operator from $c$ to $\mathbb{C}$.

Exercise 2.22 Finish the proof of Theorem 2.3 showing that Hom(V, W) is a linear 
space. 

Exercise 2.23 Let $V$ and $W$ be finite-dimensional linear spaces over a field $K$, of
dimensions $n$ and $m$ respectively, and let $B_1 = \{v_1, \dots, v_n\}$ and
$B_2 = \{w_1, \dots, w_m\}$ be fixed bases for $V$ and $W$ respectively. For any vector
$x \in V$, we write

$$[x]_{B_1} = (\alpha_1, \dots, \alpha_n) \in K^n$$

where $\alpha_1, \dots, \alpha_n$ are the unique scalars such that

$$x = \sum_{k=1}^{n} \alpha_k v_k.$$

The tuple $[x]_{B_1}$ is called the vector of coordinates of $x$ in the basis $B_1$. Similarly,
one defines $[y]_{B_2}$ for all $y \in W$.

1. Given a linear operator $T : V \to W$, prove that there exists a unique matrix
$[T]_{B_1,B_2} \in M_{n,m}(K)$ such that

$$[T(x)]_{B_2} = [T]_{B_1,B_2} \cdot [x]_{B_1}$$

where on the right-hand side the '$\cdot$' stands for the ordinary product of a matrix
by a vector. The matrix $[T]_{B_1,B_2}$ is called the representative matrix of the linear
operator $T$ in the given bases.

2. Prove that the assignment $T \mapsto [T]_{B_1,B_2}$, mapping a linear operator to the matrix
representing it in the two given bases, is a linear isomorphism

$$\mathrm{Hom}(V, W) \to M_{n,m}(K).$$

3. Suppose now that $U$ is a third linear space with its chosen basis $B_3$, and that
$S : W \to U$ is a linear operator. Prove that
$[S \circ T]_{B_1,B_3} = [S]_{B_2,B_3} \cdot [T]_{B_1,B_2}$,
where the product on the right-hand side is the ordinary product of matrices.
We remark that this exercise gives one justification for the definition of the algebraic
operations on matrices the way they are defined.

Exercise 2.24 Prove that $\mathbb{R}^n$ and $\mathbb{C}^n$, $n \ge 1$, when considered as
linear spaces over $\mathbb{R}$, are not isomorphic.

Exercise 2.25 Prove that $\mathbb{R}^\infty \cong \mathbb{C}^\infty$, when considered as linear
spaces over $\mathbb{R}$.

Exercise 2.26 A linear operator $T : V \to V$ is nilpotent if there exists an $m \ge 1$
such that $T^m$, the $m$-fold composition of $T$ with itself, is the zero operator
$0 : V \to V$. Prove that if $T : V \to V$ is nilpotent, then $\mathrm{id}_V - T$ is an
isomorphism.

Exercise 2.27 Let $T : V \to V$ be an operator with the property that for all $x \in V$
there exists an $m \ge 1$ such that $T^m(x) = 0$. Prove that if $V$ is finite-dimensional,
then $T$ is nilpotent, but if $V$ is infinite-dimensional, then $T$ need not be nilpotent.

Exercise 2.28 Prove that for every set $\mathcal{B}$ there exists a linear space with
$\mathcal{B}$ as a basis (and consequently for any cardinality $\kappa$ there exists a linear
space whose dimension is $\kappa$).

Exercise 2.29 Let $f : \mathbb{R} \to \mathbb{R}$ be a continuous function with
$f(x + y) = f(x) + f(y)$ for all $x, y \in \mathbb{R}$. Prove that there exists some
$a \in \mathbb{R}$ such that $f(x) = ax$ for all $x \in \mathbb{R}$.

Exercise 2.30 Use a Hamel basis for $\mathbb{R}$ as a linear space over $\mathbb{Q}$ to
construct a function $f : \mathbb{R} \to \mathbb{R}$ satisfying $f(x + y) = f(x) + f(y)$ for
all $x, y \in \mathbb{R}$, which is not of the form $f(x) = ax$ for any $a \in \mathbb{R}$.


2.4 Subspaces, Products, and Quotients 

Subspaces arise naturally as portions of a given linear space that inherit a linear space 
structure from the ambient linear space. We show that any linear operator gives rise 
to certain subspaces, its kernel and its image, and we show how to combine two 
linear spaces to form their product space. We also show how to eliminate a subspace 
so as to obtain a quotient space. The quotient construction is related to the concept 
of complementary subspaces, which is also introduced and the connection made 
explicit. 


2.4.1 Subspaces 

Given a subset $A \subseteq V$ of a linear space $V$, the operations of addition and scalar
multiplication, when restricted to vectors in $A$, always yield elements in the ambient
space $V$. We are interested in the case where the results of these operations always
yield elements in $A$.

Definition 2.7 A subset $A \subseteq V$ is said to be closed under addition if

$$x + y \in A$$

for all vectors $x, y \in A$. Similarly, $A$ is said to be closed under scalar multiplication if

$$\alpha x \in A$$

for all vectors $x \in A$ and scalars $\alpha \in K$. The set $A$ is called a linear subspace
(or simply subspace if the linear context is evident) of $V$ if $A \ne \emptyset$ and $A$ is
closed under addition and scalar multiplication. A subspace $A$ is called a proper subspace if
it is properly contained in the ambient space $V$.

Remark 2.12 If $A$ is a linear subspace of $V$, then $A$, with vector addition and scalar
multiplication induced from $V$, is a linear space. Indeed, the verification of each of
the axioms follows the same pattern. For instance,

$$x + y = y + x$$

holds for all $x, y \in A$ since the same equality holds for all vectors $x$ and $y$ in the
ambient space $V$. Moreover, notice that the zero vector 0 is always an element in $A$
and is the zero vector of $A$. Indeed, $0 = 0 \cdot x$ holds for all $x \in A$, and since $A$
is required to be non-empty, at least one $x \in A$ exists.

Notice that the subset $\{0\}$ consisting of just the zero vector is always a subspace,
called the trivial subspace of $V$. Another immediate example of a subspace of $V$ is
$V$ itself, i.e., a non-proper subspace. An immediate property of subspaces is their
transitivity: given $A \subseteq B \subseteq V$, if $B$ is a subspace of $V$ and $A$ is a
subspace of $B$, then $A$ is a subspace of $V$.

Example 2.21 Consider the linear space $\mathbb{R}^2$ and its subset $A$ consisting only of the
vectors of the form $(x, 0) \in \mathbb{R}^2$. Clearly, $A$ is non-empty, closed under addition,
and closed under scalar multiplication, and hence is a subspace of $\mathbb{R}^2$. More
generally, a line $l = \{tx \mid t \in \mathbb{R}\}$, with $x \in \mathbb{R}^2$ a non-zero
vector, is a linear subspace of $\mathbb{R}^2$. These linear subspaces, together with the
subspaces $\{0\}$ and $\mathbb{R}^2$, exhaust all of the linear subspaces of $\mathbb{R}^2$.
Similarly, linear subspaces of $\mathbb{R}^n$, for larger values of $n$, correspond to the
origin, to lines through the origin, to planes through the origin, and to hyper-planes through
the origin. Notice the following subtle point. The field $\mathbb{R}$, when viewed as a linear
space over itself, is obviously isomorphic to $\mathbb{R}^1$, that is to $\mathbb{R}^n$ where
$n = 1$. However, even though the linear space $\mathbb{R} = \mathbb{R}^1$ may naturally be
identified with either the X-axis or the Y-axis in $\mathbb{R}^2$, formally speaking,
$\mathbb{R}$ itself (or $\mathbb{R}^1$) is not a subspace of $\mathbb{R}^2$. In fact, it is not
even a subset of $\mathbb{R}^2$. More generally, $\mathbb{R}^n$ is a subspace of $\mathbb{R}^m$
if, and only if, $n = m$, in which case the two spaces coincide. When $n < m$ the space
$\mathbb{R}^n$ may be identified in infinitely many ways with various subspaces of
$\mathbb{R}^m$, but that is a different story. An analogous discussion can be given
for linear spaces over $\mathbb{C}$.

Example 2.22 Consider the linear space $\mathbb{C}^\infty$ from Example 2.3 and the linear
spaces $c$ of convergent sequences, $c_0$ of sequences that converge to 0, and $c_{00}$ of
sequences that are eventually 0 (see Example 2.4). One clearly has that

$$c_{00} \subset c_0 \subset c \subset \mathbb{C}^\infty$$

and in fact $c_{00}$ is a linear subspace of $c_0$, which in turn is a linear subspace of $c$,
which is a linear subspace of $\mathbb{C}^\infty$. The verification is immediate.

Other subspaces of $\mathbb{C}^\infty$ include, for each $n \ge 1$, the set of all vectors of
the form $(x_1, \dots, x_n, 0, 0, \dots)$. This space is clearly isomorphic to $\mathbb{C}^n$,
as are infinitely many other subspaces of $\mathbb{C}^\infty$, as the reader is asked to
verify. In any event, $\mathbb{C}^n$ is not itself a subspace of $\mathbb{C}^\infty$.

Example 2.23 Consider the linear space $P_n$ (Example 2.6) consisting of polynomial
functions with real coefficients of degree not exceeding $n$, and $P$, the linear space
of all polynomials with real coefficients. For all $n < m$, it holds that
$P_n \subset P_m \subset P$, and it follows easily that $P_n$ is a subspace of $P_m$, which in
turn is a subspace of $P$. We thus obtain the tower of proper subspace inclusions

$$P_0 \subset P_1 \subset P_2 \subset \cdots \subset P_n \subset \cdots \subset P.$$

Example 2.24 With reference to Example 2.7, the subset of $C([a, b], \mathbb{R})$ consisting
of the continuous functions $x : [a, b] \to \mathbb{R}$ which vanish at $a$ and $b$, that is
such that $x(a) = x(b) = 0$, constitutes a linear subspace. On the contrary, the subset of those
continuous functions with $x(a) = x(b) = 1$ does not constitute a linear subspace
(for instance since it fails to contain the zero vector).
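
The contrast between the two subsets can be spot-checked numerically. The sketch below
assumes the interval $[0, 1]$ and three sample functions chosen for illustration; the
predicate names are hypothetical, and the checks only test endpoint values, not continuity.

```python
a, b = 0.0, 1.0   # assumed interval [a, b]

def add(x, y):
    """Pointwise sum of two functions."""
    return lambda t: x(t) + y(t)

def scale(alpha, x):
    """Pointwise scalar multiple of a function."""
    return lambda t: alpha * x(t)

def vanishes_at_endpoints(x):
    return x(a) == 0 and x(b) == 0

def equals_one_at_endpoints(x):
    return x(a) == 1 and x(b) == 1

u = lambda t: t * (1 - t)        # vanishes at both endpoints
v = lambda t: t * t * (1 - t)    # vanishes at both endpoints
w = lambda t: 1 + t * (1 - t)    # equals 1 at both endpoints
```

Sums and scalar multiples of `u` and `v` still vanish at the endpoints, while `add(w, w)`
takes the value 2 there and `scale(0, w)` is the zero function, so the second subset is
closed under neither operation.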

The following family of linear spaces, which are subspaces of C°°, is of significant 
importance. 

Example 2.25 (The $\ell_p$ Spaces). Consider the linear space $\mathbb{C}^\infty$ from Example
2.3 and recall that a sequence $x = (x_1, \dots, x_k, \dots) \in \mathbb{C}^\infty$ is said to
be bounded if there exists some $M \in \mathbb{R}$ such that

$$|x_k| \le M$$

for all $k \ge 1$. Let $\ell_\infty \subseteq \mathbb{C}^\infty$ be the set of all bounded
sequences. Clearly, the zero vector, namely the constantly zero sequence, is in $\ell_\infty$,
the sum of two bounded sequences is again bounded, and thus in $\ell_\infty$, and if $x$ is
bounded by $M$, then $\alpha x = (\alpha x_1, \dots, \alpha x_k, \dots)$ is bounded by
$|\alpha| M$, and thus is in $\ell_\infty$. It thus follows that $\ell_\infty$ is a linear
subspace of $\mathbb{C}^\infty$.

Now, let $p \ge 1$ and consider the set $\ell_p$ of all sequences of complex numbers that
are absolutely $p$-power summable, that is, all sequences $x \in \mathbb{C}^\infty$ such that

$$\sum_{k=1}^{\infty} |x_k|^p < \infty.$$

Clearly, the constantly zero sequence belongs to $\ell_p$, and $\ell_p$ is evidently closed
under scalar multiplication. To show that $\ell_p$ is a subspace of $\mathbb{C}^\infty$ it thus
remains to be shown that $\ell_p$ is closed under addition, a task we leave to the reader.

We thus obtain a one-parameter family $\{\ell_p\}_{1 \le p \le \infty}$ of linear subspaces of
$\mathbb{C}^\infty$. In fact, it is easily seen that if $1 \le p < q \le \infty$, then
$\ell_p \subset \ell_q$ (consider a suitable variation of the harmonic series), and so
$\ell_p$ is a proper linear subspace of $\ell_q$.
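
The hint about the harmonic series can be explored numerically. The sketch below takes, as
an assumption made for illustration, the sequence $x_k = k^{-1/p}$, for which
$\sum_k |x_k|^p$ is the harmonic series while $\sum_k |x_k|^q$ is a convergent series when
$q > p$. Partial sums only suggest the behaviour and do not constitute a proof; the function
name is hypothetical.

```python
def partial_sum(p, exponent, n_terms):
    """Partial sum of sum_k |x_k|^exponent for the sequence x_k = k^(-1/p)."""
    return sum((k ** (-1.0 / p)) ** exponent for k in range(1, n_terms + 1))
```

With `p = 1` and `q = 2`, the partial sums for exponent 1 keep growing as more terms are
added, while those for exponent 2 stay below $\pi^2 / 6$.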

Example 2.26 Recall from Example 2.8 that $\mathbb{R}$ may be viewed as a linear space over
$\mathbb{R}$, and that $\mathbb{C}$ can be viewed as a linear space over $\mathbb{C}$ or as a
linear space over $\mathbb{R}$. The set inclusion $\mathbb{R} \subseteq \mathbb{C}$ is of course
always valid, but $\mathbb{R}$ is a linear subspace of $\mathbb{C}$ only when both are
considered to be linear spaces over $\mathbb{R}$ (or over $\mathbb{Q}$). A subspace
relation between two linear spaces can only hold if both linear spaces are over the
same field, and so, for instance, $\mathbb{C}^n$ as a linear space over $\mathbb{R}$ is not a
linear subspace of $\mathbb{C}^n$ as a linear space over $\mathbb{C}$.

We now address the relationship between subspaces and dimension which, in the 
finite case, is as one would expect. But the infinite case, as usual, is more subtle. 

Theorem 2.6 For a linear space $V$ of dimension $n$ and a subspace $U \subseteq V$ with
dimension $m$, the following hold.

1. $m \le n$.

2. If $V$ is finite dimensional and $m = n$, then $U = V$.

Proof

1. The proof is given in full generality, thus $n$ and $m$ may be infinite cardinals. Let
$B_V$ be a basis for $V$ and let $B_U$ be a basis for $U$. By assumption, the cardinality
of $B_V$ is $n$ while the cardinality of $B_U$ is $m$. Now, by definition of basis, $B_U$ is a
linearly independent set of vectors in $U$, and hence is also linearly independent
in $V$, while $B_V$ is a spanning set in $V$. Applying Lemma 2.1, it follows that there
exists an injection $B_U \to B_V$, and thus $m \le n$.

2. Assume now that $m = n$ and that $V$ is finite dimensional, namely that $n = m$ is
a natural number. We may thus find a basis $\{x_1, \dots, x_n\}$ of $U$. In particular these
vectors are linearly independent in $U$, and thus also in $V$. By Proposition 2.4, it
follows that the set $\{x_1, \dots, x_n\}$ is a basis of $V$, so its span is $V$. But the span is
also $U$, and the claim follows.

□ 

Remark 2.13 Much as in Remark 2.6, the finite dimensionality assumption cannot
be avoided. Indeed, the linear space $c_{00}$ from Example 2.4 is, as we saw, of countably
infinite dimension but it does contain proper subspaces of the same dimension. For
instance, it is easy to verify that the set of all sequences in $c_{00}$ whose first term
is 0 is a proper subspace of $c_{00}$ even though it is isomorphic to $c_{00}$, and thus has the
same dimension as the ambient space.

2.4.2 Kernels and Images

With any linear operator $T : V \to W$ one can associate a subspace of the domain,
called the kernel of $T$, and a subspace of the codomain, called the image of $T$. As
we show below, the kernel suffices to detect whether or not $T$ is injective.

Definition 2.8 Let $T : V \to W$ be a linear operator. The set

$$\mathrm{Ker}(T) = \{x \in V \mid T(x) = 0\}$$

is called the kernel of $T$.

Theorem 2.7 The kernel of a linear operator $T : V \to W$ is a linear subspace
of $V$.

Proof In Proposition 2.5 we already noted that $T(0) = 0$ and thus $0 \in \mathrm{Ker}(T)$. All
we need to do now is show that the kernel is closed under vector addition and scalar
product. Indeed, if $x, y \in \mathrm{Ker}(T)$, then

$$T(x + y) = T(x) + T(y) = 0 + 0 = 0$$

and thus $x + y \in \mathrm{Ker}(T)$. Similarly, for all $\alpha \in K$, if $x \in \mathrm{Ker}(T)$, then

$$T(\alpha x) = \alpha T(x) = \alpha \cdot 0 = 0$$

and thus $\alpha x \in \mathrm{Ker}(T)$ and the proof is complete. □

Theorem 2.8 A linear operator $T : V \to W$ is injective if, and only if, its kernel is
trivial, i.e., $\mathrm{Ker}(T) = \{0\}$.

Proof If $T$ is injective, then since we already know that $T(0) = 0$, it follows that
$T(x) = 0$ implies $x = 0$, and thus $\mathrm{Ker}(T) = \{0\}$, that is, the kernel is trivial. Conversely,
suppose that the kernel of $T$ is trivial and suppose that

$$T(x) = T(y)$$

for some $x, y \in V$. Then

$$T(x - y) = T(x) - T(y) = 0$$

and thus

$$x - y \in \mathrm{Ker}(T).$$

But then $x - y = 0$, and so $x = y$, showing that $T$ is injective. □

Another subspace naturally obtained from a linear operator is its image.

Definition 2.9 Let $T : V \to W$ be a linear operator. The set $\mathrm{Im}(T) = \{T(x) \mid x \in
V\}$, namely the set-theoretic image of $T$, is called the image of $T$.

Proposition 2.10 The image of a linear operator $T : V \to W$ is a linear subspace
of $W$.

Proof Obviously, $\mathrm{Im}(T) = T(V)$ is not empty, since it contains $T(0) = 0$. Further, if
$x, y \in \mathrm{Im}(T)$, then $x = T(x')$ and $y = T(y')$ for some vectors $x', y' \in V$. It then follows that

$$x + y = T(x') + T(y') = T(x' + y') \in \mathrm{Im}(T).$$

Similarly, one shows that $\alpha x \in \mathrm{Im}(T)$ for all $x \in \mathrm{Im}(T)$ and all scalars $\alpha \in K$. □

The following result, which the reader may recognize as the rank-nullity theorem, 
has important consequences for finite dimensional linear spaces. 

Lemma 2.3 The equality

$$\dim(V) = \dim(\mathrm{Ker}(T)) + \dim(\mathrm{Im}(T))$$

holds for all linear operators $T : V \to W$ between finite dimensional linear spaces.

Proof We present the general strategy, inviting the reader to fill in the details. Choose
a basis $\{x_1, \dots, x_k\}$ for $\mathrm{Ker}(T)$. Then augment this basis by (if needed) adding
vectors to it, so as to obtain a basis $\{x_1, \dots, x_k, y_1, \dots, y_m\}$ of $V$, and show that
$\{T(y_1), \dots, T(y_m)\}$ is a basis for $\mathrm{Im}(T)$. □
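
For operators between the spaces $\mathbb{Q}^n$ the equality can be checked by direct computation. The following is a minimal sketch, assuming the operator is given by a matrix with rational entries (the matrix `A` and the helper names `rref`, `kernel_basis`, and `apply` are illustrative, not from the text). It computes a basis of the kernel from the reduced row echelon form, verifies that each basis vector is indeed mapped to zero, and compares dimensions.

```python
from fractions import Fraction

def rref(matrix):
    """Return (reduced row echelon form, list of pivot columns) over the rationals."""
    rows = [[Fraction(e) for e in row] for row in matrix]
    pivots, r = [], 0
    for c in range(len(rows[0])):
        # look for a row at or below r with a non-zero entry in column c
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        lead = rows[r][c]
        rows[r] = [e / lead for e in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    return rows, pivots

def kernel_basis(matrix):
    """A basis of Ker(T) for T(x) = matrix * x, one vector per non-pivot column."""
    reduced, pivots = rref(matrix)
    n_cols = len(matrix[0])
    basis = []
    for free in (c for c in range(n_cols) if c not in pivots):
        v = [Fraction(0)] * n_cols
        v[free] = Fraction(1)
        for row_index, pivot_col in enumerate(pivots):
            v[pivot_col] = -reduced[row_index][free]
        basis.append(v)
    return basis

def apply(matrix, v):
    """Compute T(v) = matrix * v."""
    return [sum(Fraction(a) * b for a, b in zip(row, v)) for row in matrix]

# T : Q^4 -> Q^3; the third row is the sum of the first two.
A = [[1, 2, 0, 1],
     [0, 1, 1, 0],
     [1, 3, 1, 1]]
kernel = kernel_basis(A)
dim_kernel = len(kernel)
dim_image = len(rref(A)[1])  # the rank, i.e. the number of pivot columns
print(dim_kernel, dim_image, len(A[0]))
```

Here the number of pivot columns plays the role of $\dim(\mathrm{Im}(T))$ and the number of remaining columns that of $\dim(\mathrm{Ker}(T))$, mirroring the splitting of a basis of $V$ used in the proof strategy.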

Remark 2.14 The proof above can be adapted to obtain a similar result for infinite 
dimensional linear spaces, provided one handles cardinal arithmetic with care, of 
course. However, such a result is not of great importance and we thus avoid its details. 

Corollary 2.4 For a linear operator $T : V \to W$ between finite dimensional linear
spaces of equal dimension, the conditions

• $T$ is injective

• $T$ is surjective

• $T$ is bijective

are equivalent.

Proof We have seen that $T$ is injective if, and only if, its kernel is trivial, namely
precisely when $\dim(\mathrm{Ker}(T)) = 0$. It follows from Lemma 2.3 that $T$ is injective if, and only
if, $\dim(V) = \dim(\mathrm{Im}(T))$. But since $\dim(V) = \dim(W)$, it follows by Theorem
2.6 that $\dim(V) = \dim(\mathrm{Im}(T))$ holds if, and only if, $\mathrm{Im}(T) = W$, in other words,
precisely when $T$ is surjective. Thus injectivity and surjectivity are equivalent, and
each is then equivalent to bijectivity. □

2.4.3 Products and Quotients

We now present the product construction for two linear spaces $V$ and $W$ over the
same field $K$. We endow the set $V \times W$ with an addition operation and a scalar
product that turn it into a linear space, as follows. Given $(x_1, x_2), (y_1, y_2) \in V \times W$,
and $\alpha \in K$, the addition operation is given by

$$(x_1, x_2) + (y_1, y_2) = (x_1 + y_1, x_2 + y_2)$$

and the scalar product operation is given by

$$\alpha(x_1, x_2) = (\alpha x_1, \alpha x_2).$$

Theorem 2.9 For linear spaces $V$ and $W$ over the same field $K$, the set $V \times W$,
when endowed with addition and scalar multiplication as above, is a linear space
over the field $K$.

Proof The verification of the linear space axioms is straightforward. For instance,
the additively neutral element is $(0, 0)$ since

$$(x_1, x_2) + (0, 0) = (x_1 + 0, x_2 + 0) = (x_1, x_2)$$

for all $(x_1, x_2) \in V \times W$. We leave the rest of the verification to the reader. □

Obviously, one can similarly define the product $V_1 \times \cdots \times V_n$ of any finite number
of linear spaces, and even the product of any collection of linear spaces. It is also
evident, upon inspection of the linear structure of the linear space $\mathbb{R}^n$, that $\mathbb{R}^n$ is
the $n$-fold product of $\mathbb{R}$ with itself, where $\mathbb{R}$ is viewed as a linear space over itself.
Similarly, $\mathbb{C}^n$ is the $n$-fold product of $\mathbb{C}$ with itself.

We now turn to consider the quotient construction of a linear space by a linear 
subspace. This construction is, in some sense, the reversal of taking the product of 
linear spaces. We refer the reader to Sect. 1.2.15 of the Preliminaries for basic facts 
about equivalence relations. 

Definition 2.10 Let $V$ be a linear space and $U$ a subspace of it. Two elements
$x_1, x_2 \in V$ are said to be equivalent modulo $U$ if

$$x_1 - x_2 \in U$$

that is, if

$$x_1 = x_2 + u$$

for some $u \in U$. This relation is denoted by $x_1 \equiv x_2 \pmod{U}$, by $x_1 \equiv_U x_2$, or simply
by $x_1 \equiv x_2$ if $U$ is evident.

We now show that $\equiv$ is an equivalence relation on $V$. Indeed, for all $x, y, z \in V$

$$x - x = 0 \in U,$$

and thus

$$x \equiv x,$$

so that $\equiv$ is reflexive. Further,

$$x \equiv y \implies x - y \in U \implies y - x = (-1) \cdot (x - y) \in U \implies y \equiv x,$$

and thus $\equiv$ is symmetric. Finally,

$$x - y, \; y - z \in U \implies x - z = (x - y) + (y - z) \in U \implies x \equiv z,$$

and so $\equiv$ is transitive.

Recall that the equivalence class of any vector $x \in V$ is the set

$$[x] = \{y \in V \mid x \equiv y\}.$$

In other words, it is the set of all vectors of the form $x + u$, where $u \in U$. For that
reason the equivalence class $[x]$ is also denoted by

$$x + U.$$

It follows from the general theory of equivalence relations that $\{x + U \mid x \in V\}$ is a
partition of $V$.


Example 2.27 As a simple example, consider the usual 2-dimensional plane $\mathbb{R}^2$ and
its subspace $Y$ consisting of the ordinates. Let the element $r \in \mathbb{R}^2$ be the vector $\overrightarrow{OP}$
from the origin to the point $P$ in the plane. The equivalence class $[r]_Y$, formed by the
vectors of $\mathbb{R}^2$ equivalent mod $Y$ to $r$, is clearly given by all vectors $\overrightarrow{OP'}$, with $P'$
lying on the line parallel to the $Y$ axis passing through the point $P$; all these vectors
are equivalent mod $Y$ to each other.

The fact that the entire 2-dimensional space is partitioned by all such equivalence
classes is simply the fact that $\mathbb{R}^2$ is the disjoint union of all the lines parallel to the
$Y$ axis, which are precisely the translates $x + Y$ of $Y$.

We now turn to investigate the quotient set $V/U = \{x + U \mid x \in V\}$ of equivalence
classes modulo $U$. It is natural to introduce an addition and a scalar product on $V/U$
by the formulas

$$(y + U) + (z + U) = (y + z) + U \quad \text{and} \quad \alpha(y + U) = (\alpha y) + U,$$

but we must first verify that these operations are well-defined, namely that they do
not depend on the chosen representatives. Indeed, for all $x, y, x', y' \in V$, we need
to show that if

$$x + U = x' + U \quad \text{and} \quad y + U = y' + U,$$

then

$$(x + y) + U = (x' + y') + U.$$

But the former implies that

$$x' - x \in U \quad \text{and} \quad y' - y \in U$$

while the latter will be implied by showing that $(x' + y') - (x + y) \in U$. And indeed,
since $U$ is closed under addition,

$$(x' + y') - (x + y) = (x' - x) + (y' - y) \in U.$$

A similar argument shows that the scalar product above is well-defined and thus we
see that the quotient set $V/U$ of equivalence classes is naturally endowed with an
addition operation and a scalar product operation.
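
Independence of representatives can be seen concretely in the setting of Example 2.27. The following is a minimal sketch, assuming vectors of $\mathbb{R}^2$ are modelled as pairs of integers and $Y = \{(0, t)\}$ is the axis of ordinates; the function names are hypothetical. Two different representatives are chosen for each of two cosets, and the sums and scalar multiples are checked to land in the same coset.

```python
def equivalent_mod_Y(x, y):
    """x is equivalent to y modulo Y = {(0, t)}: the difference x - y lies in Y."""
    return x[0] - y[0] == 0

def add(x, y):
    """Componentwise addition in R^2."""
    return (x[0] + y[0], x[1] + y[1])

def scale(a, x):
    """Scalar multiplication in R^2."""
    return (a * x[0], a * x[1])

# Two representatives for each of two cosets (vertical lines).
x, x_alt = (1, 5), (1, -2)   # both lie on the vertical line through (1, 0)
y, y_alt = (3, 0), (3, 7)    # both lie on the vertical line through (3, 0)

# Changing representatives does not change the resulting coset.
print(equivalent_mod_Y(add(x, y), add(x_alt, y_alt)))
print(equivalent_mod_Y(scale(4, x), scale(4, x_alt)))
```

Both checks succeed because the first coordinate, which determines the coset, is unaffected by the choice of representative.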

Theorem 2.10 Let $V$ be a linear space and $U \subseteq V$ a subspace. The quotient set
$V/U$ of equivalence classes modulo $U$, with the operations of addition and scalar
multiplication as given above, is a linear space whose zero vector is $U$. Moreover,
the canonical projection $\pi : V \to V/U$ given by $\pi(x) = x + U$ is a surjective linear
operator.

Proof To see that $U = 0 + U$ behaves neutrally with respect to addition, note that
for all $x + U \in V/U$

$$(x + U) + U = (x + U) + (0 + U) = (x + 0) + U = x + U.$$

The rest of the verification of the linear space axioms, as well as the claims about
the canonical projection, follow similarly and are left for the reader. □

Definition 2.11 The linear space $V/U$ constructed in Theorem 2.10 is called the
quotient space of $V$ with respect to $U$, or simply $V$ modulo $U$.

Continuing the example above, $\mathbb{R}^2/Y$, which geometrically can be thought of as
the space of all vertical lines in $\mathbb{R}^2$, is a linear space. Another, more complicated,
example is $c_0/c_{00}$, the quotient of the space $c_0$ of all converging sequences of (say)
real numbers, by the subspace of all eventually 0 sequences. The quotient may be
thought of as the space of all sequences modulo finite changes. In analysis it is well
known that the limit of a sequence is insensitive to finite changes in the sequence,
and so it is precisely this quotient that is of interest already in elementary analysis.


2.4.4 Complementary Subspaces 

Contemplating the quotient space associated to Example 2.27, one realizes that $\mathbb{R}^2/Y$
is isomorphic to the $X$ axis, and that under this isomorphism, the canonical projection $\pi :
\mathbb{R}^2 \to \mathbb{R}^2/Y$ is nothing but the projection of $\mathbb{R}^2$ onto the $X$ axis. This observation
generalizes to any quotient space construction, as we now show.

Definition 2.12 Two subspaces $U$ and $W$ of a given linear space $V$ are said to be
complementary

• if the only vector they have in common is the zero vector, in other words if $U \cap W =
\{0\}$; and

• if $U + W = V$; that is, given $x \in V$, there exist $u \in U$ and $w \in W$ with $x = u + w$.

Remark 2.15 The decomposition above of an element $x \in V$ as the sum $x = u + w$
is in fact unique. Indeed, if $x = u + w$ and also $x = u' + w'$, with $u, u' \in U$ and
$w, w' \in W$, then

$$0 = x - x = (u + w) - (u' + w') \implies u - u' = w' - w.$$

But $u - u' \in U$ and $w' - w \in W$, and since the only element in $U \cap W$ is the zero
vector, it follows that $u - u' = 0 = w' - w$, and therefore, $u = u'$ and $w = w'$. This
unique decomposition is said to express $V$ as a direct sum of its subspaces $U$ and $W$,
a fact denoted by $V = U \oplus W$.

The following example shows that a given subspace $U \subseteq V$ may admit more than
one complementary subspace.

Example 2.28 (Continuing Example 2.27) In $\mathbb{R}^2$ any straight line $N$ passing through
the origin and not coincident with $Y$ is a subspace complementary to $Y$, as can easily be
verified by elementary means.
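
The unique decomposition with respect to different complements of $Y$ can be computed explicitly. The following is a minimal sketch, assuming the line $N$ is spanned by the vector $(1, s)$ for some slope $s$ (a parametrization chosen here for illustration; the function name `decompose` is hypothetical). A vector $x = (a, b)$ then splits as $w = (a, as) \in N$ plus $u = (0, b - as) \in Y$.

```python
def decompose(x, slope):
    """Split x in R^2 as u + w, with u on the Y axis and w on the line N
    spanned by (1, slope)."""
    a, b = x
    w = (a, a * slope)        # the component in N
    u = (0, b - a * slope)    # the component in Y
    return u, w

x = (2, 7)
for slope in (0, 1, -3):      # three different complements N of Y
    u, w = decompose(x, slope)
    print(slope, u, w)
```

The same vector $x$ has a different pair of components for each choice of $N$, while for a fixed $N$ the pair is determined by $x$ alone, in line with Remark 2.15.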



We may now present the main theorem relating complementary subspaces and 
the quotient construction. 

Theorem 2.11 Given a linear space $V$ and two complementary subspaces $U$ and
$W$, the quotient space $V/U$ is naturally isomorphic to $W$ (the meaning of ‘naturally’
is discussed below).

Proof Recall from Theorem 2.10 that the canonical projection $\pi : V \to V/U$, given
by $\pi(x) = x + U$, is a linear operator. The result will be established by showing that
the restriction of $\pi$ to $W \subseteq V$ is an isomorphism between $W$ and $V/U$. All we have
to do is show that the restriction is a bijection. For injectivity we use Theorem 2.8
and show the kernel of the restriction is trivial. For that, suppose that $\pi(w) = 0$ for
some $w \in W$, that is (recall that the zero vector in $V/U$ is $U$)

$$w + U = U$$

and so $w \in U$. But since

$$U \cap W = \{0\},$$

it follows that $w = 0$, as needed. For surjectivity, let $y + U$ be an arbitrary element
in $V/U$. Writing $y = u + w$, with $u \in U$ and $w \in W$, we get that

$$y + U = (u + w) + U = (u + U) + (w + U) = w + U = \pi(w). \qquad □$$


Remark 2.16 The meaning of the isomorphism above being natural (or canonical) 
is that its construction does not depend on any choice of basis. This is an important 
observation since, as we saw, in infinite dimensional spaces it may be impossible to 
actually describe any basis. 

Corollary 2.5 Given a linear space $V$ and a subspace $U \subseteq V$, any two subspaces
$W_1$ and $W_2$ that are complementary to $U$ are naturally isomorphic.

Proof Each of $W_1$ and $W_2$ is naturally isomorphic to $V/U$. □

In the case of complementary spaces, the stated isomorphism is independent of 
a basis but it is dependent on the ability to express the ambient space as the direct 
sum of the given subspace and each of its complements. If the latter can be done 
explicitly, then an explicit formula for the isomorphism will emerge. 

Exercises 

Exercise 2.31 Let $V$ be a linear space and $S \subseteq V$ a subset. Prove that $S$ is a linear
subspace of $V$ if, and only if, $S$ is closed under linear combinations, i.e., if

$$\sum_{k=1}^{m} \alpha_k s_k \in S$$

for all $\alpha_1, \dots, \alpha_m \in K$ and $s_1, \dots, s_m \in S$, and $m \ge 0$. Here we adopt the convention
that the empty sum, i.e., the case $m = 0$, is equal to 0.

Exercise 2.32 Prove Lemma 2.3: If $T : V \to W$ is a linear operator between finite
dimensional linear spaces, then

$$\dim(V) = \dim(\mathrm{Ker}(T)) + \dim(\mathrm{Im}(T)).$$

Exercise 2.33 Let $V$ be a linear space and let $\{S_i\}_{i \in I}$ be a family of subspaces of
$V$. Prove that the intersection

$$S = \bigcap_{i \in I} S_i$$

is a linear subspace of $V$. On the other hand show that if $S_1, S_2 \subseteq V$ are two linear
subspaces of $V$, then $S_1 \cup S_2$ is a linear subspace of $V$ if, and only if, either $S_1 \subseteq S_2$
or $S_2 \subseteq S_1$.

Exercise 2.34 (Refer to Example 2.25) Prove that $\ell_p$, for $1 \le p \le \infty$, is a linear
space and that if $1 \le p < q \le \infty$, then $\ell_p \subset \ell_q$.

Exercise 2.35 Let $V$ and $W$ be two linear spaces over the same field $K$. Prove that
$V \times W \cong W \times V$. Can equality ever hold?

Exercise 2.36 Let $V$ be a linear space and $U \subseteq V$ a linear subspace. Prove that the
operation

$$\alpha(x + U) = \alpha x + U,$$

for all vectors $x \in V$ and scalars $\alpha \in K$, is well-defined. That is, show that it is
independent of the choice of representative for the equivalence class $x + U$.

Exercise 2.37 Let $V$ be a linear space. Prove that $V/V \cong 0$, where 0 denotes any
linear space with a single element, necessarily its zero vector. Show also that $V/\{0\} \cong
V$, where $\{0\}$ is the trivial subspace of $V$. Is it possible that any of these isomorphisms
is an actual equality?

Exercise 2.38 Let $V$ be a linear space and $U \subseteq V$ a linear subspace. Prove that the
canonical projection $\pi : V \to V/U$ given by

$$\pi(x) = x + U$$

is a surjective linear operator.

Exercise 2.39 Let $V$ be a linear space over $K$ and $U \subseteq V$ a subspace. Consider the
diagram

[Diagram not recoverable from the source; it relates $U$, $V$, $V/U$, and $V'$ via the maps described below.]

where $i : U \to V$ is the inclusion function, $\pi : V \to V/U$ is the canonical
projection to the quotient space, and $V'$ is an arbitrary linear space over $K$. The rest
of the diagram deciphers as follows: Prove that for all linear operators $T : V \to V'$
with the property that $T(x) = 0$ for all $x \in U$, there exists a unique linear operator
$T' : V/U \to V'$ such that $T' \circ \pi = T$. (This exercise establishes what is known
as the universal property of the quotient construction.)

Exercise 2.40 Let $V$ and $W$ be two linear spaces over the same field $K$, and consider
the product $V \times W$. Show that

1. $U = \{(x, 0) \mid x \in V\}$ is a linear subspace of $V \times W$ which is isomorphic to $V$.
We denote $U$ more suggestively by $V \times \{0\}$.

2. $(V \times W)/(V \times \{0\}) \cong W$.


2.5 Inner Product Spaces and Normed Spaces 

So far in our treatment of linear spaces two geometric aspects are missing, namely
angles between vectors and the length of a vector. This situation is unavoidable since
in general linear spaces it need not be possible to coherently define any of these
notions. In the presence of extra structure, these notions become available, as we
discuss below.

2.5.1 Inner Product Spaces

The reader is most likely familiar with the standard inner product on $\mathbb{R}^n$, i.e.,

$$\langle x, y \rangle = \sum_{k=1}^{n} x_k y_k.$$

The definition of a general inner product space is then a generalization of certain 
properties the standard inner product has. We will not pause here to motivate this 
definition any further, except for two comments. The first comment is that the results 
below will show that an inner product allows one (at least in the real case) to speak of 
angles between vectors, thus obtaining retrospective justification for the axioms. The 
second comment pertains to the non-arbitrary nature of the standard inner product; 
the reader is invited to discover it from elementary geometry, i.e., the law of cosines. 

Definition 2.13 (Inner Product Space) Let $V$ be a linear space over $K$, where $K$ is
either $\mathbb{R}$ or $\mathbb{C}$. Assume that a function $\langle -, - \rangle : V \times V \to K$ is given. The function
$\langle -, - \rangle$ is said to be an inner product if for all $x, y, z \in V$ and $\alpha, \beta \in K$, the following
conditions hold.

1. Conjugate symmetry or Hermitian symmetry, i.e., $\langle y, x \rangle = \overline{\langle x, y \rangle}$, where $\overline{w}$
denotes the complex conjugate of $w$.

2. Linearity in the second argument, i.e., $\langle x, \alpha y + \beta z \rangle = \alpha \langle x, y \rangle + \beta \langle x, z \rangle$.

3. The inner product is positive definite, i.e., $\langle x, x \rangle \ge 0$ for all $x \in V$, and $\langle x, x \rangle = 0$
implies that $x = 0$, the zero vector. In particular, we then introduce the norm of
$x$ to be $\|x\| = \sqrt{\langle x, x \rangle}$.

The linear space $V$ together with the function $\langle -, - \rangle$ is then called an inner
product space. To emphasize that $K = \mathbb{R}$ we use the term real inner product space,
while the case $K = \mathbb{C}$ is given the term complex inner product space. The scalar
$\langle x, y \rangle$ is the inner product of the given vectors (in that order!).

Remark 2.17 Notice that when $K = \mathbb{R}$, conjugate symmetry reduces to symmetry,
that is

$$\langle y, x \rangle = \langle x, y \rangle,$$

and the inner product is also linear in its first argument, namely,

$$\langle \alpha x + \beta y, z \rangle = \alpha \langle x, z \rangle + \beta \langle y, z \rangle.$$

In the complex case one generally has that

$$\langle \alpha x + \beta y, z \rangle = \overline{\alpha} \langle x, z \rangle + \overline{\beta} \langle y, z \rangle,$$

as follows easily from conjugate symmetry and linearity in the second argument. In
both real and complex inner product spaces, the inner product is additive in each
argument. We mention as well that in mathematical circles it is common to write
$\langle x, y \rangle$ instead of $\langle y, x \rangle$, that is, to define an inner product as being linear in the first
argument rather than the second one. Obviously, the difference is only cosmetic.

Example 2.29 Consider the linear space $\mathbb{R}^n$ and the definition

$$x \cdot y = (x_1, \dots, x_n) \cdot (y_1, \dots, y_n) = \sum_{k=1}^{n} x_k y_k.$$

With this definition it is easily seen that $\mathbb{R}^n$ becomes a real inner product space.
Similarly, considering $\mathbb{C}^n$, defining

$$x \cdot y = (x_1, \dots, x_n) \cdot (y_1, \dots, y_n) = \sum_{k=1}^{n} \overline{x_k} \, y_k$$

endows $\mathbb{C}^n$ with the structure of a complex inner product space. Notice that for
$x, y \in \mathbb{R}^\infty$ the ‘obvious definition’

$$x \cdot y = (x_1, \dots) \cdot (y_1, \dots) = \sum_{k=1}^{\infty} x_k y_k$$

fails to endow $\mathbb{R}^\infty$ with the structure of an inner product space, simply because the
sum may fail to converge, and thus this is not even a function.
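
The axioms of Definition 2.13 can be spot-checked for the standard inner product on $\mathbb{C}^n$. The following is a minimal sketch using Python's built-in complex numbers, with the conjugate placed on the first argument so that the product is linear in the second argument, as in the convention of this chapter; the sample vectors and the scalar are arbitrary choices made for illustration.

```python
def inner(x, y):
    """Standard inner product on C^n: conjugate-linear in x, linear in y."""
    return sum(xk.conjugate() * yk for xk, yk in zip(x, y))

x = [1 + 2j, 3j]
y = [2 - 1j, 1 + 1j]
a = 2 - 3j

# conjugate symmetry: <y, x> equals the conjugate of <x, y>
print(inner(y, x) == inner(x, y).conjugate())
# homogeneity in the second argument: <x, a y> = a <x, y>
print(inner(x, [a * yk for yk in y]) == a * inner(x, y))
# <x, x> is a non-negative real number
print(inner(x, x))
```

The entries were chosen with small integer real and imaginary parts so that the floating point arithmetic is exact and the equalities can be tested directly; for general data one would compare up to a tolerance.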

Example 2.30 Consider the space $C(I, \mathbb{R})$ of continuous real valued functions on
the interval $I = [a, b]$. The familiar properties of the integral show at once that
defining, for all $x, y \in C(I, \mathbb{R})$,

$$\langle x, y \rangle = \int_a^b dt \, x(t) y(t)$$

endows $C(I, \mathbb{R})$ with the structure of a real inner product space. With the proper
conjugation in the integral above, the space $C(I, \mathbb{C})$ of continuous complex valued
functions becomes a complex inner product space.
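
This inner product can be approximated numerically. The following is a minimal sketch, assuming a simple midpoint rule with a fixed number of subintervals (a choice made here for illustration; any standard quadrature would do) and the interval $[0, \pi]$ with the functions sine and cosine as sample vectors.

```python
import math

def inner(x, y, a, b, n=10000):
    """Midpoint-rule approximation of the integral of x(t) * y(t) over [a, b]."""
    h = (b - a) / n
    return h * sum(x(a + (k + 0.5) * h) * y(a + (k + 0.5) * h) for k in range(n))

a, b = 0.0, math.pi
print(inner(math.sin, math.cos, a, b))   # close to 0
print(inner(math.sin, math.sin, a, b))   # close to pi / 2
```

The second value approximates $\langle x, x \rangle$ for $x = \sin$, which is strictly positive as required by positive definiteness.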

Before we present any further examples let us explore some elementary, and not so 
elementary, general results. 

Proposition 2.11 For all vectors $x, y$ in an inner product space $V$ (either real or
complex)

$$\|x + y\|^2 \le \|x\|^2 + 2|\langle x, y \rangle| + \|y\|^2.$$

Proof Expanding $\|x + y\|^2 = \langle x + y, x + y \rangle$ we see that

$$\langle x + y, x + y \rangle = \langle x, x + y \rangle + \langle y, x + y \rangle = \langle x, x \rangle + \langle x, y \rangle + \langle y, x \rangle + \langle y, y \rangle$$

and thus the claim will follow by showing that

$$\langle x, y \rangle + \langle y, x \rangle \le 2 \cdot |\langle x, y \rangle|,$$

which is immediate if $V$ is a real inner product space. In the complex case suppose
that $\langle x, y \rangle = u + iv$. Then

$$\langle x, y \rangle + \langle y, x \rangle = \langle x, y \rangle + \overline{\langle x, y \rangle} = 2 \cdot u$$

and since

$$|u| = \sqrt{u^2} \le \sqrt{u^2 + v^2} = |u + iv| = |\langle x, y \rangle|,$$

the result follows. □

The geometric interpretation of the norm $\|x\|$ is that it is the length of the vector $x$.
As for the geometric meaning of the inner product, notice that any non-zero vector
$x \in V$ can be normalized by defining $\hat{x} = x/\|x\|$, a vector of unit length. For any
vector $\hat{x}$ of unit length and an arbitrary vector $y$, the scalar $\langle \hat{x}, y \rangle$ is interpreted as the
component of the vector $y$ in the direction $\hat{x}$. Consequently, we make the following
definition.

Definition 2.14 Let $V$ be an inner product space. Two vectors $x, y \in V$ are said to
be orthogonal or perpendicular if $\langle x, y \rangle = 0$.

Theorem 2.12 (Pythagoras’ Theorem) The equality

$$\|x + y\|^2 = \|x\|^2 + \|y\|^2$$

holds for all orthogonal vectors $x$ and $y$ in an inner product space.

Proof $\|x + y\|^2 = \langle x + y, x + y \rangle = \|x\|^2 + \langle x, y \rangle + \langle y, x \rangle + \|y\|^2 = \|x\|^2 + \|y\|^2$. □


2.5.2 The Cauchy-Schwarz Inequality

The following inequality is among the most important inequalities in mathematics.
It has numerous uses, some of which we will see immediately and some later on.

Theorem 2.13 (Cauchy-Schwarz Inequality) The inequality

$$|\langle x, y \rangle| \le \|x\| \, \|y\|$$

holds for all vectors $x, y$ in an inner product space $V$.

Proof For simplicity let us assume $V$ is a real inner product space (only a small
adaptation is needed for the complex case). The inequality is trivial when $x = 0$ and
so we proceed assuming that $x \ne 0$. Let $\hat{x} = x/\|x\|$ be the normalization of $x$, so in
particular $\|\hat{x}\| = 1$. Dividing the desired inequality by $\|x\|$ we see that we need to
establish the inequality

$$|\langle \hat{x}, y \rangle| \le \|y\|$$

(and so the Cauchy-Schwarz Inequality acquires the interpretation that the absolute
value of a component of $y$ in a given unit direction sets a lower bound on the length
of $y$). Rewriting $y$ to identify its component in the $\hat{x}$ direction we obtain

$$y = y_{\hat{x}} + (y - y_{\hat{x}})$$

where $y_{\hat{x}} = \langle \hat{x}, y \rangle \hat{x}$. Since

$$\langle \hat{x}, y - y_{\hat{x}} \rangle = \langle \hat{x}, y \rangle - \langle \hat{x}, y_{\hat{x}} \rangle = \langle \hat{x}, y \rangle - \langle \hat{x}, y \rangle \langle \hat{x}, \hat{x} \rangle = 0$$

it follows that $y_{\hat{x}}$ and $y - y_{\hat{x}}$ are orthogonal. Pythagoras’ Theorem (Theorem 2.12)
now yields

$$\|y\|^2 = \|y_{\hat{x}}\|^2 + \|y - y_{\hat{x}}\|^2 \ge \|y_{\hat{x}}\|^2 = \langle \hat{x}, y \rangle^2.$$

Extracting square roots completes the proof. □

Corollary 2.6 Since we now know that

$$-1 \le \frac{\langle x, y \rangle}{\|x\| \, \|y\|} \le 1$$

for all non-zero vectors $x$ and $y$ in an arbitrary real inner product space, it follows
that

$$\theta = \arccos \frac{\langle x, y \rangle}{\|x\| \, \|y\|}$$

is defined. Noting that

$$\langle x, y \rangle = \cos \theta \, \|x\| \, \|y\|$$

we define $\theta$ to be the angle between the vectors $x$ and $y$.
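
The definition of the angle translates directly into a computation. The following is a minimal sketch for $\mathbb{R}^n$ with the standard inner product; the function names are hypothetical, and the clamping of the quotient to $[-1, 1]$ is a precaution against floating point rounding added here, not part of the mathematical definition.

```python
import math

def inner(x, y):
    """Standard inner product on R^n."""
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    """Norm induced by the inner product."""
    return math.sqrt(inner(x, x))

def angle(x, y):
    """Angle, in radians, between two non-zero vectors of R^n."""
    cosine = inner(x, y) / (norm(x) * norm(y))
    # rounding may push the quotient slightly outside [-1, 1]
    return math.acos(max(-1.0, min(1.0, cosine)))

x, y = (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)
print(abs(inner(x, y)) <= norm(x) * norm(y))   # the Cauchy-Schwarz inequality
print(math.degrees(angle(x, y)))               # approximately 45 degrees
```

Without the Cauchy-Schwarz inequality there would be no guarantee that the argument passed to `math.acos` lies in its domain.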

We have just seen that a consequence of the Cauchy-Schwarz inequality is that, in
a real inner product space, angles between vectors are a well-defined notion. There
are many more consequences of the Cauchy-Schwarz inequality, among which we
mention one. The Heisenberg uncertainty principle in Quantum Mechanics is the
result of applying the Cauchy-Schwarz inequality in a certain Hilbert space, suitably
constructed. Unfortunately, this result falls slightly short of the scope of this book.


2.5.3 Normed Spaces 

In an inner product space every vector can be assigned a norm. A normed space is 
then an abstraction of certain key properties of this norm. This is useful since, as will 
be shown below, many linear spaces fail to admit an inner product but do admit the 
structure of a normed space. In such spaces one may still speak of the length of a 
vector, but not (necessarily) of angles between vectors. 

Definition 2.15 (Normed Space) A linear space $V$ with a function $x \mapsto \|x\|$, which
associates with every vector $x \in V$ a real number $\|x\|$, called the norm of $x$, is said
to be a normed space if for all vectors $x, y \in V$ and $\alpha \in K$ the following hold.

1. Positivity, i.e., $\|x\| > 0$ provided $x \ne 0$.

2. Homogeneity, i.e., $\|\alpha x\| = |\alpha| \, \|x\|$.

3. Triangle inequality, i.e., $\|x + y\| \le \|x\| + \|y\|$.

The following result is an immediate consequence of the axioms. 

Proposition 2.12 Let $V$ be a normed linear space. For all vectors $x, y \in V$:

1. $\|x\| = 0$ if and only if $x = 0$.

2. $\|-x\| = \|x\|$.

3. $\|x - y\| \le \|x\| + \|y\|$.

4. $\big| \|x\| - \|y\| \big| \le \|x - y\|$.

Proof

1. Using homogeneity, $\|0\| = \|0 \cdot 0\| = 0 \cdot \|0\| = 0$, and together with positivity
the claim follows.

2. Using homogeneity, $\|-x\| = \|(-1) \cdot x\| = 1 \cdot \|x\| = \|x\|$.

3. Using (2) and the triangle inequality, $\|x - y\| = \|x + (-y)\| \le \|x\| + \|y\|$.

4. We need to show that $-\|x - y\| \le \|x\| - \|y\| \le \|x - y\|$. Using the triangle
inequality, $\|x\| = \|(x - y) + y\| \le \|x - y\| + \|y\|$ and similarly, using (2),
$\|y\| = \|(y - x) + x\| \le \|x - y\| + \|x\|$, as needed. □

The definition of normed space was motivated by certain intuitive properties that the 
lengths of vectors, as modeled by the norm in an inner product space, satisfy. The 
next result shows that the inner product formalism indeed gives rise to a normed 
space, as expected. 

Lemma 2.4 Any inner product space, with its associated notion of norm, is a normed
space.

Proof Recall that the norm in an inner product space is given by $\|x\| = \sqrt{\langle x, x \rangle}$,
and condition 1 in the definition of normed space is immediate. For condition 2, we
have that

$$\|\alpha x\|^2 = \langle \alpha x, \alpha x \rangle = \overline{\alpha} \alpha \langle x, x \rangle = |\alpha|^2 \|x\|^2$$

and the desired equality follows. As for the triangle inequality, by Proposition 2.11

$$\|x + y\|^2 = \langle x + y, x + y \rangle \le \|x\|^2 + 2|\langle x, y \rangle| + \|y\|^2$$

and by applying the Cauchy-Schwarz inequality we obtain that

$$\|x + y\|^2 \le \|x\|^2 + 2\|x\| \, \|y\| + \|y\|^2 = (\|x\| + \|y\|)^2,$$

and the result follows. □

We close the section, and the chapter, by introducing two very important families
of normed spaces. The $\ell_p$ spaces, introduced already in Example 2.25, are spaces of
sequences and each carries its own norm, the $\ell_p$ norm. The second family of spaces,
the pre-$L_p$ spaces, allude to the family of $L_p$ spaces and the $L_p$ norm. The pre-$L_p$
spaces are, in a sense, the continuous version of the $\ell_p$ spaces and the $L_p$ spaces are
the completion of the pre-$L_p$ spaces (completions are discussed in Chap. 4).


2.5.4 The Family of $\ell_p$ Spaces


For the reader’s convenience, we repeat here the definition of the $\ell_p$ spaces, and
augment it with the definition of the $\ell_p$ norm.

Definition 2.16 Let $1 \le p < \infty$ be a real number. The set $\ell_p$ consists of all
sequences $x = (x_1, \dots, x_k, \dots)$ of complex numbers for which

$$\sum_{k=1}^{\infty} |x_k|^p < \infty.$$

The $\ell_p$-norm of $x$ is

$$\|x\|_p = \left( \sum_{k=1}^{\infty} |x_k|^p \right)^{1/p}.$$

The $\ell_\infty$ space is the space of all bounded sequences $x$ of complex numbers and the
$\ell_\infty$ norm is

$$\|x\|_\infty = \sup_k |x_k|.$$

Note that it is immediate that $\|x\|_p = 0$ if, and only if, $x = 0$. These spaces are over
the field $\mathbb{C}$. By considering sequences of real numbers one obtains similar spaces
over the field $\mathbb{R}$ which, ambiguously, are also referred to as $\ell_p$ spaces.

To further investigate the structure of $\ell_p$, we need the following elementary result.

Proposition 2.13 (Young’s Inequality) The inequality

$$ab \le \frac{a^p}{p} + \frac{b^q}{q}$$

holds for all positive real numbers $p$ and $q$ satisfying $1/p + 1/q = 1$, and non-negative
real numbers $a$ and $b$, with equality if, and only if, $a^p = b^q$.

Proof Recall the weighted arithmetic-geometric means inequality

$$\sqrt[w]{x_1^{w_1} x_2^{w_2}} \le \frac{w_1 x_1 + w_2 x_2}{w}$$

which holds for all non-negative real numbers $x_1, x_2$, and all positive weights $w_1, w_2$,
with $w = w_1 + w_2$. Applying this inequality to the numbers $x_1 = a^p$ and $x_2 = b^q$,
with weights $w_1 = 1/p$ and $w_2 = 1/q$, yields the desired inequality. □
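
Young's inequality is easy to probe numerically. The following is a minimal sketch; the helper `young_gap` is a hypothetical name for the difference between the right-hand and left-hand sides, computed for a given $p > 1$ with the conjugate exponent $q = p/(p - 1)$. A non-negative gap is what the inequality predicts, and a vanishing gap corresponds to the equality case $a^p = b^q$.

```python
def young_gap(a, b, p):
    """Return a^p / p + b^q / q - a * b, where q = p / (p - 1) is conjugate to p."""
    q = p / (p - 1.0)
    return a ** p / p + b ** q / q - a * b

print(young_gap(2.0, 3.0, 3.0) >= 0)   # the inequality for one sample triple
print(young_gap(1.5, 1.5, 2.0))        # equality case: p = q = 2 and a = b
```

A finite number of samples is of course no substitute for the proof; the sketch only makes the roles of $p$, $q$, $a$, and $b$ concrete.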

The next inequality is used often when manipulating elements of the spaces $\ell_p$.
It is used below to establish the triangle inequality when proving that each $\ell_p$ space,
with its $\ell_p$ norm, is a normed space. It is convenient to first introduce the following
notation. For any two sequences $x = (x_1, \dots, x_k, \dots)$ and $y = (y_1, \dots, y_k, \dots)$, let

$$xy = (x_1 y_1, \dots, x_k y_k, \dots),$$

namely, the vector of component-wise products of the given vectors.

Theorem 2.14 (Hölder’s Inequality) Given positive real numbers $p$ and $q$ for which
$1/p + 1/q = 1$, the inequality

$$\|xy\|_1 \le \|x\|_p \cdot \|y\|_q$$

holds for all $x \in \ell_p$ and $y \in \ell_q$.

Proof The case where either $\|x\|_p = 0$ or $\|y\|_q = 0$ is trivial and so we may assume
this is not so. By component-wise division of $x$ and $y$, respectively, by $\|x\|_p$ and
$\|y\|_q$, we may assume that $\|x\|_p = \|y\|_q = 1$, namely that

$$\sum_{k=1}^{\infty} |x_k|^p = 1 = \sum_{k=1}^{\infty} |y_k|^q$$

and we need to show that $\|xy\|_1 \le 1$, or in other words that

$$\sum_{k=1}^{\infty} |x_k y_k| \le 1.$$

By Young’s Inequality (Proposition 2.13) we have that

$$|x_k y_k| = |x_k| \, |y_k| \le \frac{|x_k|^p}{p} + \frac{|y_k|^q}{q}$$

and therefore

$$\sum_{k=1}^{\infty} |x_k y_k| \le \frac{1}{p} + \frac{1}{q} = 1,$$

as required. □

Theorem 2.15 (Minkowski’s Inequality) Let $1 \le p < \infty$. The inequality

$$\|x + y\|_p \le \|x\|_p + \|y\|_p$$

holds for all $x, y \in \ell_p$.

Proof Since $|x_k + y_k| \le |x_k| + |y_k|$, notice that

$$\|x + y\|_p^p = \sum_{k=1}^{\infty} |x_k + y_k|^p \le \sum_{k=1}^{\infty} |x_k| \, |x_k + y_k|^{p-1} + \sum_{k=1}^{\infty} |y_k| \, |x_k + y_k|^{p-1}.$$

For $p = 1$ this is already the claimed inequality, so assume that $p > 1$ and, the
remaining case being trivial, that $\|x + y\|_p \ne 0$.
Applying Hölder’s inequality (Theorem 2.14) with the given $p$ and the associated
$q = p/(p - 1)$, we obtain

$$\sum_{k=1}^{\infty} |x_k| \, |x_k + y_k|^{p-1} \le \left( \sum_{k=1}^{\infty} |x_k|^p \right)^{1/p} \left( \sum_{k=1}^{\infty} |x_k + y_k|^p \right)^{(p-1)/p} = \|x\|_p \, \frac{\|x + y\|_p^p}{\|x + y\|_p}$$

and a similar inequality for the second summand. It follows that

$$\|x + y\|_p^p \le \left( \|x\|_p + \|y\|_p \right) \frac{\|x + y\|_p^p}{\|x + y\|_p}$$

and, by simplifying, the claimed inequality is established. □
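
Both inequalities can be spot-checked on finitely supported sequences, which belong to every $\ell_p$. The following is a minimal sketch; `p_norm` is a hypothetical helper computing the $\ell_p$ norm of a finite list (thought of as a sequence that vanishes from some point on), and the sample data and the exponent $p = 3$ are arbitrary choices.

```python
def p_norm(x, p):
    """The l_p norm of a finitely supported sequence, for 1 <= p < infinity."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

x = [3.0, -1.0, 2.0, 0.5]
y = [1.0, 4.0, -2.0, 1.5]
p = 3.0
q = p / (p - 1.0)   # conjugate exponent, so that 1/p + 1/q = 1

xy = [a * b for a, b in zip(x, y)]        # component-wise product
x_plus_y = [a + b for a, b in zip(x, y)]  # component-wise sum

holder_ok = p_norm(xy, 1.0) <= p_norm(x, p) * p_norm(y, q)
minkowski_ok = p_norm(x_plus_y, p) <= p_norm(x, p) + p_norm(y, p)
print(holder_ok, minkowski_ok)
```

Note that Hölder's inequality pairs the $p$-norm of one sequence with the $q$-norm of the other, whereas Minkowski's inequality involves the same exponent $p$ throughout.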


Theorem 2.16 For all $1 \le p \le \infty$ the space $\ell_p$ is a normed linear space.

Proof The case $p = \infty$ is left for the reader, so assume that $p < \infty$. The non-negativity
of $\|x\|_p$ is clear from the definition of the norm and it was already noted
that $\|x\|_p = 0$ implies $x = 0$. Homogeneity is also immediate,
since

$$\|\alpha x\|_p^p = \sum_{k=1}^{\infty} |\alpha x_k|^p = |\alpha|^p \, \|x\|_p^p$$

and finally the triangle inequality

$$\|x + y\|_p \le \|x\|_p + \|y\|_p$$

is precisely Minkowski’s Inequality (Theorem 2.15). □


2.5.5 The Family of Pre-$L_p$ Spaces

With the large family $\{\ell_p\}_{1 \le p \le \infty}$ of normed spaces at hand one can put the general
theory to use for many sequences. Noticing that $\ell_p \subseteq \ell_q$ for all $1 \le p < q \le \infty$, and
with sequences in $\ell_p$ converging faster to 0 than sequences in $\ell_q$, given a bounded
sequence one would typically try to identify a suitable $p$ such that the sequence is
found in $\ell_p$, and proceed to apply the general theory to the problem at hand.

As far as sequences are concerned, the $\ell_p$ spaces are adequate but, quite often,
when modeling a physical problem mathematically one obtains a function rather
than a sequence. It is thus desirable to consider suitable normed spaces of functions.
As explained above, obtaining the correct analogue of the $\ell_p$ spaces, namely the $L_p$
spaces, requires some more sophisticated machinery than what we have presented so
far, and so at this point we present what we call the pre-$L_p$ spaces.

For the sake of simplicity, let us consider the ambient linear space $C([0, \infty), \mathbb{R})$
of all continuous functions $x : [0, \infty) \to \mathbb{R}$ (we note that with the necessary
modifications in the coming definitions and proofs, one may alter the domain, as well
as consider complex-valued functions). The pre-$L_p$ space, for $1 \le p < \infty$, is the
subset $K_p$ of $C([0, \infty), \mathbb{R})$ whose elements are those functions $x$ with

$$\int_0^{\infty} dt\, |x(t)|^p < \infty,$$

where the integral is the usual Riemann integral (since we only consider continuous
functions, the theory of Riemann integration is sufficient). The $L_p$ norm of $x \in K_p$
is given by

$$\|x\|_p = \Big(\int_0^{\infty} dt\, |x(t)|^p\Big)^{1/p}$$

and we immediately note that $\|x\|_p = 0$ implies $x = 0$ (due to the continuity of $x$).
Lastly, $K_\infty$ is the space of all bounded continuous functions $x : [0, \infty) \to \mathbb{R}$ and
the $L_\infty$ norm is given by

$$\|x\|_\infty = \sup_{0 \le t < \infty} \{|x(t)|\}.$$
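
As a quick illustration (a worked example added here, under the assumption that the $L_p$ norm is the standard one, $\|x\|_p = (\int_0^\infty dt\, |x(t)|^p)^{1/p}$), consider the function $x(t) = e^{-t}$ on $[0, \infty)$. The computation below suggests that this $x$ lies in $K_p$ for every $1 \le p \le \infty$:

```latex
\[
\int_0^{\infty} dt\, |e^{-t}|^p = \int_0^{\infty} dt\, e^{-pt} = \frac{1}{p} < \infty,
\qquad\text{so}\qquad
\|x\|_p = p^{-1/p} \quad (1 \le p < \infty),
\]
\[
\|x\|_\infty = \sup_{0 \le t < \infty} e^{-t} = e^{0} = 1 .
\]
```

In contrast, the constant function $x(t) = 1$ is bounded and continuous, hence a member of $K_\infty$, but it belongs to no $K_p$ with $p < \infty$ since the defining integral diverges.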


The rest of the section is devoted to showing that $K_p$ with the $L_p$ norm is a normed
space, for each $1 \le p \le \infty$. In fact, the proof is formally identical to the case of the
$\ell_p$ spaces. Indeed, the only ingredient one needs is the following version of Hölder’s
Inequality.

Theorem 2.17 (Hölder’s Inequality) Given positive real numbers $p$ and $q$ such that
$1/p + 1/q = 1$, the inequality

$$\|xy\|_1 \le \|x\|_p \cdot \|y\|_q$$

holds for all $x \in K_p$ and $y \in K_q$, where $xy$ is the function $(xy)(t) = x(t)y(t)$.

Proof An inspection of the proof of Hölder’s Inequality for sequences reveals that it
was obtained by a point-wise application of Young’s Inequality as well as properties
of summation which are well-known to hold for integration as well, and so the same
argument can be used to establish Hölder’s Inequality for functions. □


The details of the proof of the following result are now formally identical to the proof
of Theorem 2.16.

Theorem 2.18 For all $1 \le p \le \infty$ the pre-$L_p$ space $K_p$ is a normed linear space.

Remark 2.18 Note again that the domain of the functions $x$ in a pre-$L_p$ space can be
altered (sometimes with necessary extra care) and that the codomain can be replaced
by $\mathbb{C}$ (with obvious adaptations to the definition of the $L_p$ norm). All such spaces
are still called pre-$L_p$ spaces and denoted, ambiguously, by $K_p$. In particular, the
space $C([a, b], \mathbb{R})$ of continuous real-valued functions on a closed interval can be
endowed with an $L_p$ norm and thus is a $K_p$ space.


Exercises 


Exercise 2.41 Prove the generalized Pythagoras’ Theorem: In an inner product
space, given $m$ pairwise orthogonal vectors $x_1, \ldots, x_m$, the equality

$$\|x_1 + \cdots + x_m\|^2 = \|x_1\|^2 + \cdots + \|x_m\|^2$$

holds.


Exercise 2.42 Prove that $\ell_2$ when endowed with the operation


OO 



is an inner product space.

Exercise 2.43 Prove that $\ell_\infty$ with the $\ell_\infty$ norm is a normed space and that $K_\infty$ with
the $L_\infty$ norm is a normed space.

Exercise 2.44 Let $V$ be an inner product space. Prove the parallelogram identity

$$\|x + y\|^2 + \|x - y\|^2 = 2(\|x\|^2 + \|y\|^2)$$

for all $x, y \in V$.

Exercise 2.45 Prove that for $p \ne 2$ the space $\ell_p$ is not an inner product space.

Exercise 2.46 Does the equality


to=f)t P 

p> 0 


hold? 

Exercise 2.47 For $x \in \ell_p$, with $1 \le p < \infty$, does the equality

$$\lim_{q \to \infty} \|x\|_q = \|x\|_\infty$$


hold? 

Exercise 2.48 Prove that $K_2$ is an inner product space.

Exercise 2.49 Let $1 \le p < \infty$ and consider, for any $x \in K_p$, the sequence of
samples $s(x) = (x(1), x(2), \ldots, x(k), \ldots)$. Prove that $s$ is a linear operator
$s : K_p \to \ell_p$. How do the norms $\|x\|_p$ and $\|s(x)\|_p$ compare?

Exercise 2.50 Prove Hölder’s Inequality for functions and prove that $K_p$ with the
$L_p$ norm is a normed space.

Further Reading 

Introductory level texts on linear algebra typically treat linear spaces with a strong 
emphasis on techniques of matrices. When bases are not explicitly available, as in 
the infinite dimensional case, this approach must be replaced by a coordinate-free 
approach, as was done in this chapter. For an introductory level text treating linear 
algebra in a coordinate-free fashion see Chaps. 10-12 of Dummit and Foote [2]. A
more advanced text with a coordinate-free approach, as well as an algebraically much 
deeper approach to linear algebra, making the connection between linear spaces and 
modules, is Roman [3]. Another text that bridges the gap between matrix-centric 
linear algebra and Hilbert space theory is Brown [1]. The reader seeking to enhance 
her intuition and ability with the Cauchy-Schwarz inequality is advised to consider 
Steele [4].


Chapter 3

Topological Spaces 


Abstract This chapter, which is quite independent from the previous one, introduces 
topological spaces. It includes a detailed motivation for the definitions so as to assist 
in the digestion of the rather abstract concepts involved. The chapter is, by necessity, 
only a glimpse of the vast realms of topology. The presentation and the choice of 
concepts and results given are geared towards the applications of topology in Hilbert 
space theory while ensuring the reader develops a sufficient level of familiarity with 
the techniques (and the at times eccentric nature) of topology. 

Keywords Topological space • Continuous functions • Separation axioms • Countability
axioms • Hausdorff space • Compact space • Connected space • Product
topology • Quotient topology • Generated topology

Topology is often described as rubber-sheet geometry, i.e., the study of geometric 
properties that are insensitive to stretching and shrinking, without tearing or gluing. 
Famously, for a topologist then, there is no difference between a cup and a donut. 
The importance and impact of topology on modern mathematics is quite difficult to 
quantify but virtually impossible to exaggerate. Without a doubt its place as one of 
the pillars of mathematics is secured. 

The definition of a topological space is an abstraction of part of the structure of
$\mathbb{R}$, specifically that part that allows one to speak of continuity and convergence. The
choice for the topological notions introduced in this chapter is dictated by two needs. 
One need is to familiarize the reader with those topological concepts that are most 
relevant in the context of this book, the other is to familiarize the reader with topology 
proper, its main results, and its techniques. These two needs pull in slightly different 
directions and this chapter represents a compromise. We aim to provide the reader 
with sufficient motivation for the concepts while keeping things firmly grounded in 
analysis. 

Section 3.1 presents the definition of a topology and the accompanying notion
of continuity, followed by a detailed construction of the Euclidean topology on the
real numbers $\mathbb{R}$. The associated notion of continuity is shown to be equivalent to
the usual one, motivating the definitions. Section 3.2 is a basic study of convergence
in the topological setting, its relation to cluster points, and its behaviour under the
first countability axiom. Section 3.3 is concerned with techniques for constructing
topologies, in particular by forcing a collection of sets to be open sets and by forcing a
collection of functions to be continuous. Coproducts, products, and quotients are then
given as particular examples. Section 3.4 discusses the Hausdorff separation property
and topological connectivity, and Sect. 3.5 is a study of compactness, developing
enough material to prove the Heine-Borel Theorem from a topological perspective.


3.1 Topology — Definition and Elementary Results 

The definition of a topology is quite simple: a certain collection of subsets of a set
satisfying some simple-to-state axioms. However, without further motivation, the
axioms may appear arbitrary and not too palatable. To help the reader digest the
notion, the definitions below are immediately succeeded by a detailed motivating
discussion taking place in the familiar setting of the real numbers $\mathbb{R}$. The motivation
we give is of a topology as the result of stripping away redundancies in the definition
of continuity. We then explore various examples of topological spaces.


3.1.1 Definition and Motivation 

We present the definition of a topological space followed by the definition of a 
continuous function between topological spaces. The abstract definitions are then 
exemplified in the context of the real numbers, establishing the Euclidean topology 
on $\mathbb{R}$, and, through the process, justifying the abstract definitions.

Definition 3.1 A topology $\tau_X$ on an arbitrary set $X$ is a collection of subsets of $X$
such that the following conditions hold.

1. Both the empty set $\emptyset$ and the entire set $X$ are members of $\tau_X$.

2. $\tau_X$ is stable under arbitrary unions. That is, if $\{U_i\}_{i \in I}$ is a family of elements in
$\tau_X$, then

$$\bigcup_{i \in I} U_i \in \tau_X.$$

3. $\tau_X$ is stable under finite intersections. That is, if $U_1, \ldots, U_m$ are members of $\tau_X$,
then

$$U_1 \cap \cdots \cap U_m \in \tau_X.$$

The pair $(X, \tau_X)$ is called a topological space and the members of $\tau_X$ are referred to
as open sets. Often though, we speak of the topological space $X$ without explicitly
mentioning $\tau_X$. We write $(X, \tau)$ if there is no need to syntactically remind ourselves
that the collection $\tau$ consists of the open sets of $X$.

Remark 3.1 Clearly, stability under finite intersections is equivalent to the condition
that $U_1 \cap U_2 \in \tau_X$ for any two open sets $U_1, U_2 \in \tau_X$.

For the following definition, and as general advice for safely taking the first few
steps into the realm of topology, the reader may wish to review the basic properties
of inverse images (e.g. Sect. 1.2.9 of the Preliminaries).

Definition 3.2 Let $(X, \tau_X)$ and $(Y, \tau_Y)$ be topological spaces. A function $f : X \to Y$
is said to be continuous if $f^{-1}(U)$ is an open set in $X$ whenever $U$ is an open set in
$Y$. In other words, $f$ is continuous if

$$U \in \tau_Y \implies f^{-1}(U) \in \tau_X.$$

The following results establish a particular topology on the set $\mathbb{R}$ of real numbers,
and investigate the notion of continuity with respect to that topology. The aim is
twofold: to exemplify the definitions in a concrete and familiar setting, and to motivate
the definitions.

Recall that a function $f : \mathbb{R} \to \mathbb{R}$ is said to be continuous if, intuitively, small
changes in the value of $x$ cause small changes in the value of $f(x)$. Formally, $f$ is
continuous if for every point $x_0 \in \mathbb{R}$ and $\varepsilon > 0$ one can find a $\delta > 0$ such that if
$|x - x_0| < \delta$, then $|f(x) - f(x_0)| < \varepsilon$. The absolute value function is used here
in the role of a measurement device for distances. However, the notion of continuity
is quite insensitive to numerous changes one can perform on the absolute value
function. Intuitively, continuity is a local phenomenon, and so if one truncates the
absolute value function and defines

$$|x|_t = \begin{cases} |x| & \text{if } |x| < 1 \\ 1 & \text{otherwise,} \end{cases}$$

then continuity with respect to $|x|$ and with respect to $|x|_t$ coincide. Scaling effects,
such as defining $|x|_2 = 2|x|$, also have no effect on continuity in the precise sense
that continuity with respect to $|x|$ and with respect to $|x|_2$ coincide.

The decision to use the absolute value function in formulating the definition of 
continuity (or, more generally, that of limit) is thus revealed to be an arbitrary choice 
among many equivalently valid alternatives. It is now a natural question whether one 
can distill a definition of continuity devoid of any arbitrary and irrelevant structure 
and consequently furnish a better understanding of the concept of continuity. The 
affirmative answer is given as follows. Call a subset $U \subseteq \mathbb{R}$ open if for every $x \in U$
there exists an $\varepsilon > 0$ such that $B_\varepsilon(x) \subseteq U$, where we define

$$B_\varepsilon(x) = \{y \in \mathbb{R} \mid |x - y| < \varepsilon\} = (x - \varepsilon, x + \varepsilon),$$

the open interval of radius $\varepsilon$ about $x$. With the aim of utilizing these sets to define a
topology on $\mathbb{R}$, we first prove that any open interval is an open set.

Proposition 3.1 The open interval $B_\varepsilon(x) = (x - \varepsilon, x + \varepsilon)$, for all $x \in \mathbb{R}$ and $\varepsilon > 0$,
is an open set.

Proof Given $y \in B_\varepsilon(x) = (x - \varepsilon, x + \varepsilon)$, we are required to find a $\delta > 0$ such that
$B_\delta(y) = (y - \delta, y + \delta) \subseteq B_\varepsilon(x)$. Let

$$\delta = \varepsilon - |x - y|$$

and note that $\delta > 0$. It is now elementary algebra to verify that this $\delta$ suits our
purposes. □
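
The elementary algebra alluded to in the proof can be spelled out as follows. This is one possible way to fill in the step, using only the triangle inequality for the absolute value: for any $z \in B_\delta(y)$, with $\delta = \varepsilon - |x - y|$,

```latex
\[
|x - z| \;\le\; |x - y| + |y - z| \;<\; |x - y| + \delta
\;=\; |x - y| + \varepsilon - |x - y| \;=\; \varepsilon ,
\]
```

so that $z \in B_\varepsilon(x)$, and hence $B_\delta(y) \subseteq B_\varepsilon(x)$. The positivity of $\delta$ is just a restatement of $|x - y| < \varepsilon$, which holds because $y \in B_\varepsilon(x)$.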

We can now present our first example of a topology; an important enough example 
that we state it as a theorem. 

Theorem 3.1 The collection $\tau$ of all open sets $U \subseteq \mathbb{R}$ is a topology on $\mathbb{R}$.

Proof The empty set vacuously satisfies the condition for being an open set, simply
since it has no points at all. The set $\mathbb{R}$ itself is clearly an open set, since if $x \in \mathbb{R}$, then
$x \in B_1(x) \subseteq \mathbb{R}$. Thus the first condition in the definition of a topology is satisfied.
To show stability under arbitrary unions, suppose that $\{U_i\}_{i \in I}$ is a family of open
sets in $\mathbb{R}$; we need to show that

$$U = \bigcup_{i \in I} U_i$$

is open. Indeed, if $x \in U$, then $x \in U_i$ for some $i \in I$. But since $U_i$ is open, there
exists a $\delta > 0$ with

$$B_\delta(x) \subseteq U_i \subseteq U,$$

as required. It remains to show for any two open sets $U_1, U_2 \subseteq \mathbb{R}$ that $U_1 \cap U_2$ is
open. Indeed, if $x \in U_1 \cap U_2$, then there are $\delta_1 > 0$ and $\delta_2 > 0$ with

$$B_{\delta_1}(x) \subseteq U_1, \qquad B_{\delta_2}(x) \subseteq U_2.$$

Setting $\delta = \min\{\delta_1, \delta_2\}$ (and noting that $\delta > 0$), one easily sees that

$$B_\delta(x) \subseteq B_{\delta_1}(x) \cap B_{\delta_2}(x) \subseteq U_1 \cap U_2,$$

as required, and thus completing the proof. □

Definition 3.3 The collection of all open sets $U \subseteq \mathbb{R}$ as just defined is called the
Euclidean topology on $\mathbb{R}$. It is also commonly referred to as the standard topology
or the ordinary topology.

The open sets turn out to contain enough information about the space $\mathbb{R}$ in order to
completely characterize continuity, as follows.

Theorem 3.2 A function $f : \mathbb{R} \to \mathbb{R}$ is continuous in the usual meaning if, and
only if, $f^{-1}(U)$ is an open set for every open set $U \subseteq \mathbb{R}$.

Proof Suppose $f$ is continuous, and let $U$ be an open set. Our aim is to show that
$f^{-1}(U)$ is open. Let $x_0 \in f^{-1}(U)$, that is, $y_0 = f(x_0) \in U$; our aim is to find a
$\delta > 0$ such that

$$B_\delta(x_0) \subseteq f^{-1}(U).$$

Since $U$ is open, there exists an $\varepsilon > 0$ such that

$$B_\varepsilon(y_0) = (y_0 - \varepsilon, y_0 + \varepsilon) \subseteq U.$$

By continuity of $f$, there exists a $\delta > 0$ such that $|f(x) - y_0| < \varepsilon$, provided that
$|x - x_0| < \delta$. But that means that for every $x \in B_\delta(x_0)$

$$f(x) \in B_\varepsilon(y_0) \subseteq U.$$

In other words, $x \in f^{-1}(U)$ for all $x$ in the set $B_\delta(x_0)$, and so

$$B_\delta(x_0) \subseteq f^{-1}(U),$$

which is what we set out to obtain.

In the other direction, suppose that $f^{-1}(U)$ is an open set whenever $U$ is an
open set. Given a point $x_0 \in \mathbb{R}$ and an $\varepsilon > 0$, let $y_0 = f(x_0)$ and consider the set
$U = B_\varepsilon(y_0)$, and notice that $U$ is itself an open set (Proposition 3.1). By assumption,
the set $f^{-1}(U)$ is an open set, and it contains $x_0$. There is thus a $\delta > 0$ such that
$B_\delta(x_0) \subseteq f^{-1}(U)$. Thus, if $|x - x_0| < \delta$, then $x \in B_\delta(x_0)$, and therefore
$f(x) \in B_\varepsilon(y_0)$, showing that $|f(x) - f(x_0)| < \varepsilon$. In other words, $f$ is continuous
at $x_0$. Since $x_0$ was arbitrary, $f$ is continuous. □

Recalling Definition 3.2, the result above can be restated as follows. 

Theorem 3.3 A function $f : \mathbb{R} \to \mathbb{R}$ is continuous in the usual sense if, and only
if, it is continuous with respect to the Euclidean topology on $\mathbb{R}$.

This result justifies and motivates Definition 3.2 and, a-priori, the definition of a
topology. Let us pause to reflect on the situation right now. The familiar notion
of continuity for functions $f : \mathbb{R} \to \mathbb{R}$ was seen to be faithfully captured by the
collection of open sets $U \subseteq \mathbb{R}$. The collection of open sets was defined in terms of
$B_\varepsilon(x) = \{y \in \mathbb{R} \mid |x - y| < \varepsilon\}$, thus still directly using the absolute value function.
It would thus seem that we had accomplished nothing except for hiding the absolute
value function in some fancy notation. However, something quite substantial was
achieved. Consider the scaled absolute value function $|x|_2 = 2 \cdot |x|$, and let us
define

$$B_{2,\varepsilon}(x) = \{y \in \mathbb{R} \mid |x - y|_2 < \varepsilon\} = \Big(x - \frac{\varepsilon}{2}, x + \frac{\varepsilon}{2}\Big)$$

for all $x \in \mathbb{R}$ and $\varepsilon > 0$. Let us further say that a set $U \subseteq \mathbb{R}$ is 2-open if for all $x \in U$
there exists an $\varepsilon > 0$ such that $B_{2,\varepsilon}(x) \subseteq U$. In other words, we repeat the definition
of open sets, replacing the absolute value function by a scaled version of it.

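Proposition 3.2 The collection of open sets and the collection of 2-open sets are
the same.

Proof Notice that

$$B_{\varepsilon/4}(x) \subseteq B_{2,\varepsilon}(x) \subseteq B_\varepsilon(x)$$

holds for all $x \in \mathbb{R}$ and $\varepsilon > 0$. Thus, if $U$ is an open set and $x \in U$, then

$$B_\varepsilon(x) \subseteq U$$

for some $\varepsilon > 0$, but then

$$B_{2,\varepsilon}(x) \subseteq B_\varepsilon(x) \subseteq U$$

and so $U$ is also 2-open. Conversely, if $U$ is 2-open and $x \in U$, then

$$B_{2,\varepsilon}(x) \subseteq U$$

for some $\varepsilon > 0$, but then

$$B_{\varepsilon/4}(x) \subseteq B_{2,\varepsilon}(x) \subseteq U$$

and so $U$ is also open. □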

Similarly, one may show that the collection of open sets is insensitive to other changes 
in the absolute value function, such as scaling by other factors or various truncations. 
It is not a coincidence that those changes to the absolute value function that are 
immaterial to the notion of continuity are also immaterial to the concept of open set. 
Indeed, Theorem 3.2 characterizes continuity in terms of the open sets alone. 

Thus, by concentrating on the open sets one is freed from the irrelevant (for the 
purposes of continuity) particularities of the absolute value function. The definition 
of a topology is thus an abstraction of a particular formulation of continuity which 
avoids directly mentioning any particular concept of R which is irrelevant to the 
notion of continuity. As with any abstraction, once it is made, a plethora of new 
situations emerges, situations that superficially have little or nothing to do with the 
geometric notion of continuity. Introducing a topology on a set immediately allows 
one to import a significant amount of geometric intuition, machinery, and analogy 
into an a-priori non-geometric context.


3.1.2 More Examples

We present more topological spaces, illustrating the versatility of the formalism. We 
remark that there are far more interesting examples of topological spaces than can 
be recounted in this book. The choice of the examples given below is dictated by the 
focus of the book; the examples are chosen to elucidate the forthcoming topological 
concepts, while avoiding delving into the intricacies of the more obscure examples of 
topological spaces, thus allowing the reader to ease into the topological framework 
while remaining firmly grounded in analysis. 

Example 3.1 (Discrete and Indiscrete Topologies) Any set $X$ supports two extreme
topologies on it. One is the collection $\{\emptyset, X\}$, called the indiscrete topology on $X$,
and the other is the collection of all subsets of $X$, called the discrete topology on $X$.
It is immediate to verify that these are indeed topologies, and that they are distinct
topologies except when $|X| \le 1$.

Example 3.2 (The Sierpinski Space) Let $\mathbb{S} = \{0, 1\}$ be a set with two elements. The
collection

$$\{\emptyset, \mathbb{S}, \{1\}\}$$

is readily seen to be a topology on $\mathbb{S}$ giving rise to what is known as the Sierpinski
space.

Example 3.3 Let $X = \{a, b, c, d, e\}$ be a set consisting of five elements.
Of the collections

$$\tau = \{X, \emptyset, \{a\}, \{c, d\}, \{a, c, d\}, \{b, c, d, e\}\}$$

$$G = \{X, \emptyset, \{a\}, \{c, d\}, \{a, c, d\}, \{b, c, d\}\}$$

$$H = \{X, \emptyset, \{a\}, \{c, d\}, \{a, c, d\}, \{a, b, d, e\}\}$$

$\tau$ is a topology on $X$ as the axioms are easily confirmed by inspection. $G$ is not a
topology on $X$, since $\{a\} \cup \{b, c, d\} = \{a, b, c, d\} \notin G$. The collection $H$ is also not
a topology on $X$, since $\{c, d\} \cap \{a, b, d, e\} = \{d\} \notin H$.
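
For finite examples such as these, the inspection can be mechanized. The following Python sketch is an illustration only; the function name `is_topology` is hypothetical, and the code assumes that every member of the collection is a subset of $X$. It relies on the observation that, for a finite collection, stability under pairwise unions and pairwise intersections already gives stability under all unions and all finite intersections.

```python
from itertools import combinations


def is_topology(X, collection):
    """Check the axioms of Definition 3.1 for a finite collection of subsets of a finite set X."""
    opens = {frozenset(U) for U in collection}
    X = frozenset(X)
    # Axiom 1: the empty set and X itself must be members.
    if frozenset() not in opens or X not in opens:
        return False
    # Axioms 2 and 3: for a finite collection it suffices to test pairs.
    for U, V in combinations(opens, 2):
        if U | V not in opens or U & V not in opens:
            return False
    return True


X = set("abcde")
tau = [X, set(), {"a"}, {"c", "d"}, {"a", "c", "d"}, {"b", "c", "d", "e"}]
G = [X, set(), {"a"}, {"c", "d"}, {"a", "c", "d"}, {"b", "c", "d"}]
H = [X, set(), {"a"}, {"c", "d"}, {"a", "c", "d"}, {"a", "b", "d", "e"}]

print(is_topology(X, tau))  # True: all axioms hold
print(is_topology(X, G))    # False: the union {a, b, c, d} is missing
print(is_topology(X, H))    # False: the intersection {d} is missing
```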

Example 3.4 (The Cofinite and Cocountable Topologies) Let $X$ be a set and $\tau$ the
family of all subsets $U$ of $X$ whose complement $X - U$ is finite, and if needed,
manually include $\emptyset$ in $\tau$. We show that $\tau$ is a topology on $X$. First, $\emptyset \in \tau$ by
definition, and since $X - X = \emptyset$ is finite, it follows that $X \in \tau$, and so the first
condition in the definition of topology is satisfied. Stability under arbitrary unions is
immediate, so finally consider two sets $U_1, U_2$ with finite complements in $X$. Since

$$X - (U_1 \cap U_2) = (X - U_1) \cup (X - U_2)$$

and since both $(X - U_1)$ and $(X - U_2)$ are finite, it follows that $U_1 \cap U_2$ has finite
complement too. This topology is called the cofinite topology on $X$. Notice that if
$X$ is finite, then the cofinite topology coincides with the discrete topology on $X$
(Example 3.1).
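
The stability under arbitrary unions, described above as immediate, can be checked with De Morgan's Laws. A short sketch, where $\{U_i\}_{i \in I}$ is a family of members of the cofinite topology and $i_0 \in I$ is any index with $U_{i_0} \ne \emptyset$ (if no such index exists, the union is empty and there is nothing to prove):

```latex
\[
X - \bigcup_{i \in I} U_i \;=\; \bigcap_{i \in I} (X - U_i) \;\subseteq\; X - U_{i_0} .
\]
```

The set on the right is finite, so the union has a finite complement and is therefore a member of the cofinite topology.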

Similarly, call a subset $U \subseteq X$ cocountable if its complement $X - U$ is a countable
set. The collection of all cocountable subsets of $X$ (with $\emptyset$ added in manually) also
forms a topology on X called the cocountable topology. The proof is similar to 
the proof given above, with the necessary care when handling countable sets (see 
Sect. 1.2.13 of the Preliminaries). 


3.1.3 Elementary Observations 

We explore some immediate properties of topological spaces and of continuous 
functions, and we discuss the notion of homeomorphism. We start off with a useful 
criterion for detecting open sets in any topological space. 

Proposition 3.3 A subset $V$ of a topological space $X$ is open if, and only if, for all
$x \in V$ there exists an open set $U_x$ such that

$$x \in U_x \subseteq V.$$

Proof If $V$ is open, then clearly for every $x \in V$ one may choose $U_x = V$. In the
other direction, the stated condition implies that

$$V = \bigcup_{x \in V} U_x$$

and so $V$ is expressed as the union of open sets, and is thus itself open. □

Proposition 3.4 The composition $g \circ f : X \to Z$ of any two continuous functions
$X \xrightarrow{f} Y \xrightarrow{g} Z$ between topological spaces is itself continuous.

Proof Given an open set $U$ in $Z$, we need to verify that $(g \circ f)^{-1}(U)$ is open in $X$.
Indeed,

$$(g \circ f)^{-1}(U) = f^{-1}(g^{-1}(U))$$

and since $g$ is continuous it follows that $g^{-1}(U)$ is open in $Y$, and since $f$, too, is
continuous it follows that $f^{-1}(g^{-1}(U))$ is open in $X$. □

Proposition 3.5 Given two topologies, $\tau_1$ and $\tau_2$, on the same set $X$, the identity
function $\mathrm{id} : (X, \tau_1) \to (X, \tau_2)$ is continuous if, and only if, $\tau_2 \subseteq \tau_1$.

Proof Notice that for the identity function $\mathrm{id} : X \to X$ and any subset $U \subseteq X$ (open
or not) one has $\mathrm{id}^{-1}(U) = U$. Thus, the condition for continuity is precisely that
$U \in \tau_1$ for all $U \in \tau_2$, in other words, that $\tau_2 \subseteq \tau_1$. □

Definition 3.4 Given two topologies $\tau_1$ and $\tau_2$ on the same set $X$, if $\tau_1 \subseteq \tau_2$, then
$\tau_1$ is said to be coarser than $\tau_2$ while $\tau_2$ is said to be finer than $\tau_1$.

Notice that the discrete topology on X is the finest topology among all topologies on 
X while the indiscrete topology is the coarsest one. In general, different topologies 
on the same set may be incomparable, i.e. one need not be contained in the other. 
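
On a finite set, the interplay between continuity of the identity and the coarser/finer relation can be observed directly. The Python sketch below is an illustration only; the helper names `preimage` and `is_continuous` are hypothetical, functions are represented as dictionaries, and the three topologies on a two-point set are those of Examples 3.1 and 3.2.

```python
def preimage(f, domain, U):
    """Inverse image of U under f, where f is a dict defined on all of domain."""
    return frozenset(x for x in domain if f[x] in U)


def is_continuous(f, X, tau_X, tau_Y):
    """Definition 3.2 for finite spaces: the preimage of each open set is open."""
    opens_X = {frozenset(U) for U in tau_X}
    return all(preimage(f, X, U) in opens_X for U in tau_Y)


X = {0, 1}
indiscrete = [set(), X]
sierpinski = [set(), {1}, X]
discrete = [set(), {0}, {1}, X]
identity = {x: x for x in X}

# The identity from a finer topology to a coarser one is continuous ...
print(is_continuous(identity, X, discrete, sierpinski))    # True
# ... but not the other way round when the inclusion is strict.
print(is_continuous(identity, X, indiscrete, sierpinski))  # False
```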

Definition 3.5 A function $f : X \to Y$ between topological spaces is said to be
a homeomorphism if $f$ is bijective and both $f : X \to Y$ and $f^{-1} : Y \to X$ are
continuous functions. If there exists a homeomorphism between $X$ and $Y$, then $X$
and $Y$ are said to be homeomorphic spaces.

Remark 3.2 One should not confuse the terms homeomorphism and
homomorphism. The term homomorphism in algebra is typically used to indicate 
a structure preserving function, with isomorphism used for the invertible structure 
preserving functions. In topology, the structure preserving functions are precisely 
the continuous functions, and the invertible structure preserving functions are called 
homeomorphisms. 

Note that a bijection $f : X \to Y$ is a homeomorphism if, and only if, for all $U \subseteq Y$

$$f^{-1}(U) \in \tau_X \iff U \in \tau_Y.$$

Homeomorphic spaces thus have essentially the same collections of open sets, up to
a renaming of the elements. Intuitively, two spaces are homeomorphic if one can be
continuously changed, without gluing or tearing, to obtain the other. While homeomorphic
spaces must have the same cardinality (since a homeomorphism is in particular
a bijection), the converse quite often fails since both the bijection and its inverse
must be continuous. For instance, the circumference of a circle and a line segment
(say in $\mathbb{R}^2$) have the same cardinality, but they are not homeomorphic. This is intuitively
quite clear since (it would appear that) the only way to create the hole that the
circle encloses is to glue the two ends of the line segment, but that changes the topology
(i.e. gluing is a continuous operation but its inverse, tearing, is not). However,
turning such intuitive arguments into formal ones is usually not straightforward.

Generally speaking, it can be quite tricky to prove that two topological spaces 
are not homeomorphic. A common technique in establishing such a result is the 
following one. A property P about topological spaces is said to be a topological 
invariant if for all homeomorphic topological spaces X and Y, X has property P 
precisely when Y does. To show that two spaces are not homeomorphic, it thus 
suffices to find a topological invariant that only one of the two spaces possesses. The 
topological concepts introduced in the rest of this chapter often provide one with a 
suitable topological invariant.


3.1.4 Closed Sets

With every subset $S$ of a set $X$ one may associate the complementary set $S^c = X - S$,
giving rise to a bijective operation $\mathcal{P}(X) \to \mathcal{P}(X)$ on the set $\mathcal{P}(X)$ of all subsets
of X. It is thus clear that any concept given in terms of subsets of X gives rise, by 
taking complements, to what must essentially be an equivalent concept. This general 
observation holds for the definition of a topology, and we now devote some time to 
the relevant details. 

We first observe that the Sierpinski space $\mathbb{S} = \{0, 1\}$ from Example 3.2 can be
used to identify the open sets in an arbitrary topological space.

Proposition 3.6 There exists a bijective correspondence between the open sets of a
topological space $X$ and continuous functions $f : X \to \mathbb{S}$.

Proof Recall (e.g. Sect. 1.2.10 of the Preliminaries) that with every subset $S \subseteq X$
one may associate the indicator function

$$f_S(x) = \begin{cases} 1 & \text{if } x \in S \\ 0 & \text{if } x \notin S \end{cases}$$

and that this correspondence is a bijection between all subsets of $X$ and all functions
$X \to \mathbb{S}$. Now, for $f_S : X \to \mathbb{S}$ to be continuous, the inverse image of every open set
in $\mathbb{S}$, namely the sets $\emptyset$, $\mathbb{S}$, and $\{1\}$, must be open in $X$. The set $f_S^{-1}(\emptyset) = \emptyset$ is always
open in $X$ and similarly so is $f_S^{-1}(\mathbb{S}) = X$. So, the only further condition imposed
by continuity is that $f_S^{-1}(\{1\})$ must be open too, and since the latter set is precisely
$S$, we see that continuous functions $f : X \to \mathbb{S}$ correspond bijectively to the open
subsets of $X$. □
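
The correspondence can be observed on a small example by brute force. The Python sketch below is an illustration only: it enumerates all functions from the five-point space of Example 3.3 into the Sierpinski space and keeps those whose preimage of $\{1\}$ is open. The function name `continuous_maps_to_sierpinski` is hypothetical, and the code assumes the supplied collection is a genuine topology on the given set.

```python
from itertools import product


def continuous_maps_to_sierpinski(X, tau_X):
    """List all continuous f : X -> S, where S = {0, 1} has open sets {}, {1}, S."""
    points = sorted(X)
    opens_X = {frozenset(U) for U in tau_X}
    maps = []
    for values in product((0, 1), repeat=len(points)):
        f = dict(zip(points, values))
        # Only the preimage of {1} imposes a condition, as in the proof above.
        if frozenset(x for x in points if f[x] == 1) in opens_X:
            maps.append(f)
    return maps


X = set("abcde")
tau = [X, set(), {"a"}, {"c", "d"}, {"a", "c", "d"}, {"b", "c", "d", "e"}]
maps = continuous_maps_to_sierpinski(X, tau)
print(len(maps), len(tau))  # 6 6: one continuous map per open set
```

Of the $2^5 = 32$ functions $X \to \mathbb{S}$, exactly six are continuous, one for each open set.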


Clearly, every function $f : X \to \mathbb{S}$ is equally well completely determined by the
inverse image of 0 instead of the inverse image of 1. We thus see that there is also a
correspondence between continuous functions $f : X \to \mathbb{S}$ and the complements of
the open sets in $X$, giving rise to the following definition.

Definition 3.6 A subset F of a topological space X is said to be closed provided its 
complement X — F is open. 

The preceding discussion implies that a topology can be specified by the collection 
of closed sets. The details are given in the following result. 

Theorem 3.4 Let $X$ be an arbitrary set. A given collection $\mathcal{F}$ of subsets of $X$ is
the collection of closed sets for some topology on $X$ if and only if the following
conditions hold.

1. Both the empty set $\emptyset$ and the entire set $X$ are members of $\mathcal{F}$.

2. $\mathcal{F}$ is stable under finite unions. That is, if $F_1, \ldots, F_m$ are members of $\mathcal{F}$, then

$$F_1 \cup \cdots \cup F_m \in \mathcal{F}.$$

3. $\mathcal{F}$ is stable under arbitrary intersections. That is, if $\{F_i\}_{i \in I}$ are members of $\mathcal{F}$,
then

$$\bigcap_{i \in I} F_i \in \mathcal{F}.$$

Moreover, a collection $\mathcal{F}$ satisfying the conditions above is the collection of closed
sets for a unique topology on $X$.

Proof Applying De Morgan’s Laws (see Sect. 1.2.5 of the Preliminaries), the details
are straightforward and are left to the reader. We only stress here that the collection

$$\{X - F \mid F \in \mathcal{F}\}$$

is the unique topology determined by $\mathcal{F}$. □

Example 3.5 If a set $X$ is given the discrete topology, then every subset of it is closed,
while if $X$ is given the indiscrete topology, then the only closed subsets of $X$ are $\emptyset$
and $X$ itself. In the Sierpinski space $\mathbb{S} = \{0, 1\}$, the closed sets are: $\emptyset$, $\mathbb{S}$, $\{0\}$.
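
The closed sets listed in this example can be recomputed by taking complements. The Python sketch below is an illustration only; the function name `closed_sets` is hypothetical, and the discrete and indiscrete topologies are taken on a two-point set for concreteness.

```python
def closed_sets(X, tau):
    """Complements of the open sets, as in Definition 3.6."""
    X = frozenset(X)
    return {X - frozenset(U) for U in tau}


S = {0, 1}

# Sierpinski space: the open sets are the empty set, {1}, and S.
sierpinski = [set(), {1}, S]
print(sorted(map(sorted, closed_sets(S, sierpinski))))  # [[], [0], [0, 1]]

# Discrete topology: every subset is closed.
discrete = [set(), {0}, {1}, S]
print(closed_sets(S, discrete) == {frozenset(U) for U in discrete})  # True

# Indiscrete topology: only the empty set and S are closed.
indiscrete = [set(), S]
print(len(closed_sets(S, indiscrete)))  # 2
```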

Example 3.6 Consider $\mathbb{R}$ with the Euclidean topology (Definition 3.3). It is easy to
verify that for $a, b \in \mathbb{R}$ with $a < b$, the open interval $(a, b)$ is an open set and the
closed interval $[a, b]$ is a closed set. The open rays $(a, \infty)$ and $(-\infty, a)$ are also
open sets, and the closed rays $[a, \infty)$ and $(-\infty, a]$ are closed sets. Every singleton
set $\{a\}$ is a closed set. Consequently, every finite set $F \subseteq \mathbb{R}$ is closed.

Remark 3.3 It should be emphasized that in general an arbitrary union of closed
sets need not be closed nor does an arbitrary intersection of open sets need be open.
For instance, consider $\mathbb{R}$ with the Euclidean topology (Definition 3.3). Since every
singleton set $\{x\}$ is closed, every subset $X \subseteq \mathbb{R}$ is the union of closed sets, i.e.

$$X = \bigcup_{x \in X} \{x\}$$

but, of course, not every subset of $\mathbb{R}$ is closed. Similarly, for any $x \in \mathbb{R}$, one has

$$\{x\} = \bigcap_{\varepsilon > 0} (x - \varepsilon, x + \varepsilon)$$

showing that $\{x\}$ is an intersection of open sets, but is itself not open.

Remark 3.4 It is important to realize that a set is not a door. A set need be neither
open nor closed, while it may be both open and closed. For instance, consider the topology
$\tau$ from Example 3.3. One easily verifies that $\{a, c\}$ is neither closed nor open and that
$\{a\}$ is both closed and open. Sets that are both closed and open are said to be clopen.
Notice that the empty set $\emptyset$ and $X$ itself are clopen in any topological space. If these
are the only clopen sets in $X$, then $X$ is said to be connected, a property we explore
further in Sect. 3.4.

The dual nature of open and closed sets is seen in the following result. Recall that a
function $f : X \to Y$ between topological spaces is continuous if $f^{-1}(U)$ is open in
$X$ for every open set $U$ in $Y$.

Theorem 3.5 A function $f : X \to Y$ between topological spaces is continuous if,
and only if, $f^{-1}(F)$ is a closed set in $X$ for every closed set $F$ in $Y$.

The proof is, as it should be, a tautology that the reader is urged to clarify for herself. 

The duality between open and closed sets is quite helpful since while dual to each 
other, open and closed sets have different topological qualities and thus it may be 
easier to establish a result in terms of closed sets rather than open ones, or vice versa. 
The reader should be warned though that not all concepts that may appear dual are 
in fact dual. 

Definition 3.7 A function $f : X \to Y$ between topological spaces is

• an open mapping if $f(U)$ is open in $Y$ for every open set $U$ in $X$.

• a closed mapping if $f(F)$ is closed in $Y$ for every closed set $F$ in $X$.

The reader is invited to find examples of open mappings that are not closed as well 
as closed ones that are not open. 


3.1.5 Bases and Subbases 

Generally speaking, a topology is a very large collection of subsets. For instance, the 
Euclidean topology on $\mathbb{R}$ has uncountably many open sets. It is thus often desirable
to obtain smaller (or at least more manageable) collections that give one access to 
the entire topology. This is what bases and subbases are designed to achieve. 

Definition 3.8 A collection $\mathcal{B}$ of open sets in a topological space $X$ is a basis (or
a base) for the topology $\tau_X$ if for every open set $U \subseteq X$ and $x \in U$, there exists an
element $B \in \mathcal{B}$ such that

$$x \in B \subseteq U.$$

The elements of $\mathcal{B}$ are called basis elements.

Example 3.7 For every topological space $(X, \tau_X)$, the collection $\mathcal{B} = \tau_X$ is, quite
trivially, a basis. More interestingly, the collection $\{B_\varepsilon(x) \mid x \in \mathbb{R}, \varepsilon > 0\}$ is a basis
for the Euclidean topology on $\mathbb{R}$. Indeed, if $U \subseteq \mathbb{R}$ is open, then, by definition, for
each $x \in U$ there is an $\varepsilon > 0$ such that

$$x \in B_\varepsilon(x) \subseteq U.$$

If $X$ is an arbitrary set endowed with the indiscrete topology, then the collection $\{X\}$
is a basis. Indeed, there are at most two open sets in the indiscrete topology, namely
$\emptyset$ and $X$. So if $U$ is open and $x \in U$, then necessarily $U = X$ and clearly

$$x \in X \subseteq X.$$

If $X$ is endowed with the discrete topology, then the collection $\{\{x\} \mid x \in X\}$ of all
singleton subsets of $X$ is a basis. Indeed, in the discrete topology every $U \subseteq X$ is
open, and if $x \in U$, then clearly

$$x \in \{x\} \subseteq U.$$

Definition 3.9 A collection $\mathcal{A}$ of open sets in a topological space $X$ is a subbasis
for the topology if the collection $\{A_1 \cap \cdots \cap A_m \mid A_1, \ldots, A_m \in \mathcal{A}, m \geq 0\}$ of all
finite intersections of elements of $\mathcal{A}$ is a basis for the topology.

Remark 3.5 By convention, the empty intersection (i.e. the case m = 0 above) is 
interpreted to be the set X itself. 

Example 3.8 Consider the set of real numbers $\mathbb{R}$ with the Euclidean topology. The
collection

$$\{(a, \infty) \mid a \in \mathbb{R}\} \cup \{(-\infty, b) \mid b \in \mathbb{R}\}$$

of all open rays is a subbasis. Indeed, the open rays are open sets and since

$$B_\varepsilon(a) = (a - \varepsilon, \infty) \cap (-\infty, a + \varepsilon)$$

we see that the collection of all finite intersections of the open rays contains the
basis we presented above. It follows that this collection of finite intersections is a
basis since, in general, if $\mathcal{B}' \supseteq \mathcal{B}$ are collections of open sets and $\mathcal{B}$ is a basis,
then so is $\mathcal{B}'$.

We now show that bases can be used directly to detect continuity of functions.

Proposition 3.7 Let $f : X \to Y$ be a function between topological spaces and let
$\mathcal{B}$ be a basis for the topology on $Y$. Then $f$ is continuous if, and only if, $f^{-1}(B)$ is
open in $X$ for all basis elements $B \in \mathcal{B}$.

Proof If $f$ is continuous, then $f^{-1}(U)$ is open for all open sets $U \subseteq Y$. Since any
$B \in \mathcal{B}$ is in particular an open set in $Y$, it follows that $f^{-1}(B)$ is open in $X$. In the
other direction, assume that $f^{-1}(B)$ is open for all $B \in \mathcal{B}$ and let $U$ be an arbitrary
open set in $Y$. By the definition of basis we may choose, for each $x \in U$, a basis
element $B_x$ with $x \in B_x \subseteq U$, and thus express $U$ as a union of basis elements

$$U = \bigcup_{x \in U} B_x.$$

It then holds that

$$f^{-1}(U) = \bigcup_{x \in U} f^{-1}(B_x)$$

and since each $f^{-1}(B_x)$ is assumed open, the set $f^{-1}(U)$ is thus expressed as a
union of open sets, and so is open. Thus $f^{-1}(U)$ is open for all open sets $U$ in $Y$,
and thus $f$ is continuous. □
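
On finite spaces Proposition 3.7 can be tested directly. The following is a minimal sketch, assuming topologies are stored as sets of frozensets and functions as dictionaries; the spaces, the basis `basis_y`, and the maps `f` and `g` are hypothetical examples, not taken from the text.

```python
def preimage(f, subset, domain):
    """Inverse image of `subset` under the function f (given as a dict)."""
    return frozenset(x for x in domain if f[x] in subset)

def is_continuous(f, domain, tau_x, tau_y):
    """Definition of continuity: the preimage of every open set is open."""
    return all(preimage(f, U, domain) in tau_x for U in tau_y)

def is_continuous_on_basis(f, domain, tau_x, basis_y):
    """Criterion of Proposition 3.7: test basis elements only."""
    return all(preimage(f, B, domain) in tau_x for B in basis_y)

# Hypothetical finite spaces, used only as an illustration.
X = frozenset({1, 2, 3})
tau_x = {frozenset(), frozenset({1}), frozenset({1, 2}), X}
Y = frozenset("pq")
tau_y = {frozenset(), frozenset("p"), Y}
basis_y = [frozenset("p"), Y]  # a basis for tau_y

f = {1: "p", 2: "p", 3: "q"}  # preimage of {p} is {1, 2}, which is open
g = {1: "q", 2: "p", 3: "p"}  # preimage of {p} is {2, 3}, which is not open

f_agrees = is_continuous(f, X, tau_x, tau_y) == is_continuous_on_basis(f, X, tau_x, basis_y)
g_agrees = is_continuous(g, X, tau_x, tau_y) == is_continuous_on_basis(g, X, tau_x, basis_y)
```

The basis test inspects two sets instead of three here; the saving becomes substantial when the topology is large and the basis is small.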
We close this section by introducing a local variant of the notion of basis.

Definition 3.10 Let $X$ be a topological space and $x \in X$. A collection $\mathcal{B}$ of open
subsets of $X$ is said to be a local basis at $x$ if for all $B \in \mathcal{B}$

$$x \in B$$

and if for every open set $U$ in $X$ with $x \in U$, there is an element $B_x \in \mathcal{B}$ (called a
basis element) with

$$B_x \subseteq U.$$

Every basis $\mathcal{B}$ for the topology on $X$ clearly gives rise to the local basis

$$\mathcal{B}_x = \{B \in \mathcal{B} \mid x \in B\}$$

at $x$. It is also the case that if for every $x \in X$ one has a local basis $\mathcal{B}_x$ at $x$, then the
collection

$$\mathcal{B} = \bigcup_{x \in X} \mathcal{B}_x$$

is a basis for the topology on $X$. We leave the verification of these simple claims to
the reader.

Exercises 

Exercise 3.1 How many topologies are there on a set with three elements? 

Exercise 3.2 Characterize all continuous functions $f : \mathbb{S} \to \mathbb{R}$ from the Sierpinski
space to $\mathbb{R}$ with the Euclidean topology.

Exercise 3.3 Prove Theorem 3.5. 

Exercise 3.4 Show that the concepts of open function, of closed function, and of 
continuous function are independent of each other. 

Exercise 3.5 Give an example of a topological space with infinitely many open sets 
in which any two non-empty open sets have non-empty intersection. 

Exercise 3.6 Let $X$ and $Y$ be sets. When $X$ is equipped with the discrete topology
and $Y$ with an arbitrary topology, show that every function $f : X \to Y$ is
continuous.

Exercise 3.7 Let $X$ and $Y$ be sets. When $Y$ is equipped with the indiscrete topology
and $X$ with an arbitrary topology, show that every function $f : X \to Y$ is
continuous.

Exercise 3.8 Show that any constant function $f : X \to Y$ between topological
spaces is continuous.


Exercise 3.9 Show that any continuous function $f : X \to Y$, when $Y$ is given the
discrete topology and $X$ is given the indiscrete topology, is a constant
function.

Exercise 3.10 Let $X = \mathbb{R}$ endowed with the Euclidean topology and let $Y = \mathbb{R}$
with the cofinite topology. Is the identity function $\mathrm{id} : X \to Y$ continuous? Is the
identity function $\mathrm{id} : Y \to X$ continuous?


3.2 Subspaces, Point-Set Relationships, and Countability Axioms 

This section introduces geometric notions regarding various possible relationships 
between a point and a set in a topological space. We also consider countability axioms 
and study the consequences they entail for some of these notions. 


3.2.1 Subspaces and Point-Set Relationships 

Any subset of a topological space naturally inherits a topology and thus one may 
speak of subspaces, which we discuss first. Next, various qualitative aspects of a 
given point and a subset of a topological space are introduced and some basic results 
are established. 

Definition 3.11 (Subspace Topology) Let $(X, \tau_X)$ be a topological space and $Y \subseteq X$
a subset. The collection

$$\tau_Y = \{U \cap Y \mid U \in \tau_X\}$$

is (easily seen to be) a topology on $Y$ called the subspace topology. The topological
space $(Y, \tau_Y)$ is called a topological subspace of $(X, \tau_X)$, or simply a subspace if
the topological context is clear.

Proposition 3.8 The subspace topology on a subset $Y$ of a topological space $X$ is
the smallest topology on $Y$ making the inclusion function $i : Y \to X$ continuous.

Proof Noticing that

$$i^{-1}(U) = Y \cap U$$

for all $U \subseteq X$ one sees that the inclusion function $i : Y \to X$ is continuous precisely
if $Y \cap U$ is open in $Y$ whenever $U \subseteq X$ is open in $X$. This condition clearly holds for
the subspace topology $\tau_Y$ on $Y$. Further, if $\tau$ is any topology on $Y$, then the condition
that $i : Y \to X$ is continuous with respect to the topologies $\tau$ on $Y$ and $\tau_X$ on $X$
amounts to the condition

$$U \in \tau_X \implies U \cap Y \in \tau,$$

and thus that $\tau_Y \subseteq \tau$, proving the minimality claim of the subspace topology $\tau_Y$. □
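
Definition 3.11 and the minimality statement of Proposition 3.8 can be illustrated on a finite space. The sketch below assumes a hypothetical four-point space with a chain of open sets; the function names are illustrative and not part of the text.

```python
def is_topology(space, tau):
    """Check the topology axioms on a finite set (pairwise checks suffice here)."""
    if frozenset() not in tau or space not in tau:
        return False
    return all((U | V) in tau and (U & V) in tau for U in tau for V in tau)

def subspace_topology(tau_x, Y):
    """Definition 3.11: intersect every open set of X with Y."""
    return {U & Y for U in tau_x}

def inclusion_is_continuous(tau_x, tau_on_y, Y):
    """The preimage of U under the inclusion i : Y -> X is U & Y."""
    return all((U & Y) in tau_on_y for U in tau_x)

# Hypothetical example space.
X = frozenset({1, 2, 3, 4})
tau_x = {frozenset(), frozenset({1}), frozenset({1, 2}), frozenset({1, 2, 3}), X}
Y = frozenset({2, 4})

tau_y = subspace_topology(tau_x, Y)
discrete_y = {frozenset(), frozenset({2}), frozenset({4}), Y}
indiscrete_y = {frozenset(), Y}
```

Here the inclusion is continuous for `tau_y` and for the larger `discrete_y`, but not for the smaller `indiscrete_y`, in line with the minimality claim.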

Given a subset and a point in a topological space, the open sets may be used to 
identify the qualitative relative location of the point in relation to the set. This is the 
subject of the following definition. 

Definition 3.12 Let $X$ be a topological space, $S \subseteq X$ a subset, and $x \in X$ a point.

1. $x$ is an interior point of $S$ if there exists an open set $U$ such that $x \in U \subseteq S$.

2. $x$ is an exterior point of $S$ if there exists an open set $U$ such that $x \in U \subseteq X - S$.

3. $x$ is a boundary point of $S$ if every open set $U$ that contains $x$ contains elements
of $S$ as well as elements of $X - S$.

4. $x$ is a cluster point or an accumulation point of $S$ if every open set $U$ that
contains $x$ contains elements of $S - \{x\}$.

5. $x$ is an isolated point of $S$ if $\{x\}$ is an open set.

Example 3.9 Let us consider the space $\mathbb{R}$ with its Euclidean topology. It is easily
seen that for $S = [a, b]$, a closed interval, as well as $S = (a, b)$, an open interval:

1. A point $x$ is interior precisely when $a < x < b$.

2. A point $x$ is exterior precisely when either $x < a$ or $x > b$.

3. A point $x$ is a boundary point precisely when $x = a$ or $x = b$.

4. A point $x$ is a cluster point precisely when $a \leq x \leq b$.

5. $S$ has no isolated points, but the space $S = [0, 1] \cup \{2\}$, as a subspace of $\mathbb{R}$ with
the Euclidean topology, admits the point $x = 2$ as its only isolated point.

Further, in $\mathbb{R}$ with the Euclidean topology, it is easy to see that every real number $x$
is a cluster point of the subset $\mathbb{Q}$ of rational numbers. In general, an isolated point
$x$ is never a cluster point of any set $S$, simply since for the open set $U = \{x\}$ one
always has $U \cap (S - \{x\}) = \emptyset$.

Definition 3.13 Let $X$ be a topological space and $S \subseteq X$ a subset.

1. The interior of $S$ is the set $\mathrm{int}(S) = \{x \in X \mid x \text{ is an interior point of } S\}$.

2. The exterior of $S$ is the set $\mathrm{ext}(S) = \{x \in X \mid x \text{ is an exterior point of } S\}$.

3. The boundary of $S$ is the set $\partial(S) = \{x \in X \mid x \text{ is a boundary point of } S\}$.

4. The derived set of $S$ is the set $S' = \{x \in X \mid x \text{ is a cluster point of } S\}$.

5. The closure of $S$ is the set $\overline{S} = S \cup S'$.

Proposition 3.9 Let $X$ be a topological space and $S \subseteq X$ a subset. It then holds
that

1. The interior $\mathrm{int}(S)$ is the union of all open sets contained in $S$.

2. The exterior $\mathrm{ext}(S)$ is the union of all open sets disjoint from $S$.

3. The closure $\overline{S}$ is the intersection of all closed sets that contain $S$.

4. $\overline{S} = S \cup \partial(S)$.
Proof

1. Let $\mathcal{U}$ be the set of all open sets $U \subseteq X$ with $U \subseteq S$, and let

$$W = \bigcup_{U \in \mathcal{U}} U.$$

We need to show that $W = \mathrm{int}(S)$. If $x \in W$, then $x \in U$ for some $U \in \mathcal{U}$, and
since $x \in U \subseteq S$ we may conclude that $x \in \mathrm{int}(S)$. Conversely, if $x \in \mathrm{int}(S)$,
then there exists an open set $U$ such that $x \in U \subseteq S$, and thus $x \in W$.

2. Notice that $\mathrm{ext}(S) = \mathrm{int}(X - S)$, and now apply 1.

3. Let $\mathcal{F}$ be the set of all closed sets $F \subseteq X$ with $S \subseteq F$, and let

$$G = \bigcap_{F \in \mathcal{F}} F.$$

We need to show that $G = \overline{S}$. Let $x \in G$, and thus $x \in F$ for all $F \in \mathcal{F}$, and
our aim is to show that $x \in \overline{S}$. If $x \in S$, then $x \in S \cup S' = \overline{S}$ so we may proceed
under the assumption that $x \notin S$. We will show that in this case $x \in S'$. Indeed,
let $U \subseteq X$ be an open set with $x \in U$ and suppose that $U \cap (S - \{x\}) = \emptyset$, which
simplifies to $U \cap S = \emptyset$ since $x \notin S$. But then the closed set $F = X - U$ contains
$S$, and thus $x \in F$, contradicting the fact that $x \in U$. In the other direction,
suppose that $x \in \overline{S} = S \cup S'$, and we aim to show that $x \in G$, namely that $x \in F$
for every closed set $F$ with $S \subseteq F$. If $x \in S$, then clearly $x \in F$, and we may thus
proceed under the assumption that $x \notin S$, and thus that $x \in S'$. Consider the open
set $U = X - F$ and note that $U \cap (S - \{x\}) = U \cap S = \emptyset$. If $x \notin F$, then $x \in U$
and we obtain a contradiction with $x \in S'$. It follows that $x \in F$, as needed.

4. To show that $\overline{S} \subseteq S \cup \partial S$, suppose that $x \in \overline{S}$ but $x \notin S$, so that $x$ is a
cluster point of $S$. Given an open set $U$ with $x \in U$, it follows that $U \cap S \neq \emptyset$.
Since $x \notin S$, namely $x \in X - S$, we also have that $x \in U \cap (X - S)$. In other
words, any open set containing $x$ contains points from $S$ as well as from $X - S$,
thus $x$ is a boundary point of $S$. In the other direction, suppose that $x \in S \cup \partial S$,
and we need to show that $x \in \overline{S} = S \cup S'$. If $x \in S$, then clearly $x \in \overline{S}$ so assume
that $x \notin S$, and thus $x \in \partial S$. But then if $U$ is an open set containing $x$, then $U$
contains a point from $S$ (and from $X - S$, though this is irrelevant here), and so
$U \cap (S - \{x\}) \neq \emptyset$ since $x \notin S$. Thus $x$ is a cluster point, as needed. □

Corollary 3.1 Let $X$ be a topological space, and $S \subseteq X$ a subset. Then

1. The interior $\mathrm{int}(S)$ is the largest open set in $X$ that is contained in $S$. In particular,
$\mathrm{int}(S)$ is open.

2. $S$ is open in $X$ if, and only if, $S = \mathrm{int}(S)$. In other words, $S$ is open precisely
when all its points are interior.

3. The exterior $\mathrm{ext}(S)$ is the largest open set in $X$ that is disjoint from $S$. In particular,
$\mathrm{ext}(S)$ is open.

4. $S$ is closed in $X$ if, and only if, $X - S = \mathrm{ext}(S)$.

5. The closure $\overline{S}$ is the smallest closed set in $X$ that contains $S$. In particular, $\overline{S}$ is
closed.

6. $S$ is closed in $X$ if, and only if, $S' \subseteq S$ (equivalently, if $S = \overline{S}$).
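
For a finite space, the sets of Definition 3.13 can be computed straight from the definitions and compared with the descriptions in Proposition 3.9. The sketch below assumes a hypothetical four-point space and a hypothetical subset `S`; all function names are illustrative.

```python
def interior(S, tau):
    """Union of all open sets contained in S (Proposition 3.9, item 1)."""
    result = frozenset()
    for U in tau:
        if U <= S:
            result |= U
    return result

def closure(S, X, tau):
    """Intersection of all closed sets containing S (Proposition 3.9, item 3)."""
    result = X
    for U in tau:
        F = X - U  # the closed sets are the complements of the open sets
        if S <= F:
            result &= F
    return result

def derived_set(S, X, tau):
    """Cluster points: every open set containing x meets S - {x}."""
    return frozenset(
        x for x in X if all(U & (S - {x}) for U in tau if x in U)
    )

def boundary(S, X, tau):
    """Boundary points: every open set containing x meets both S and X - S."""
    return frozenset(
        x for x in X if all((U & S) and (U - S) for U in tau if x in U)
    )

# Hypothetical example space and subset.
X = frozenset({1, 2, 3, 4})
tau = {frozenset(), frozenset({1}), frozenset({1, 2}), frozenset({1, 2, 3}), X}
S = frozenset({2, 3})

int_S = interior(S, tau)
ext_S = interior(X - S, tau)  # ext(S) = int(X - S)
cl_S = closure(S, X, tau)
der_S = derived_set(S, X, tau)
bd_S = boundary(S, X, tau)
```

In this example the closure computed as an intersection of closed sets agrees with both `S | der_S` and `S | bd_S`, as Definition 3.13 and item 4 of Proposition 3.9 predict.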


3.2.2 Sequences and Convergence 

We now turn to look at sequences in topological spaces, what it means for such 
sequences to converge, and how this notion relates to the topological concepts given 
above. 

Definition 3.14 Let $\{x_m\}_{m \geq 1}$ be a sequence of points in a topological space $X$. A
point $x_0 \in X$ is said to be a limit of the sequence $\{x_m\}_{m \geq 1}$ if for every open set
$U \subseteq X$ with $x_0 \in U$, there exists an $N \in \mathbb{N}$ such that

$$m > N \implies x_m \in U.$$

The sequence is then said to converge to $x_0$ and to be a convergent sequence.

Just as continuity of functions can be detected using a basis, so can cluster points 
and limit points be so detected. 

Proposition 3.10 In a topological space $X$ with a local basis $\mathcal{B}$ at a point $x_0$:

1. $x_0$ is a cluster point of a subset $S \subseteq X$ if, and only if, $B \cap (S - \{x_0\}) \neq \emptyset$ for all
basis elements $B \in \mathcal{B}$.

2. A sequence $\{x_m\}_{m \geq 1}$ converges to $x_0$ if, and only if, for every $B \in \mathcal{B}$ there exists
an $N \in \mathbb{N}$ such that

$$m > N \implies x_m \in B.$$

Proof We only prove the first assertion, as the two proofs are very similar. If $x_0$ is a
cluster point and $B \in \mathcal{B}$ is a basis element, then, in particular, $B$ is open and $x_0 \in B$,
which implies that $B \cap (S - \{x_0\}) \neq \emptyset$. Conversely, suppose the stated condition
holds and let $U$ be an arbitrary open set with $x_0 \in U$. By definition of basis at a
point, there exists a basis element $B \in \mathcal{B}$ with $x_0 \in B \subseteq U$. By hypothesis then
$B \cap (S - \{x_0\}) \neq \emptyset$ and since $U \cap (S - \{x_0\}) \supseteq B \cap (S - \{x_0\})$, the proof is
complete. □

Example 3.10 Consider the Euclidean topology on $\mathbb{R}$ and recall that a local basis at
$x_0$ is given by $\{B_\varepsilon(x_0)\}_{\varepsilon > 0}$. Thus, a sequence $\{x_m\}_{m \geq 1}$ in $\mathbb{R}$ converges to $x_0 \in \mathbb{R}$ if,
and only if, for every $\varepsilon > 0$ there exists an $N \in \mathbb{N}$ such that

$$m > N \implies x_m \in B_\varepsilon(x_0),$$

or, in other words: if $m > N$, then $|x_m - x_0| < \varepsilon$. We thus see that convergence in
the topological sense given above is equivalent to convergence of sequences in $\mathbb{R}$ in
the usual sense.
It is very reassuring that the notion of convergence can be captured topologically, but
one should be cautious not to expect the familiar behaviour of limits in $\mathbb{R}$ to remain
valid in arbitrary topological spaces, as the following example illustrates.

Example 3.11 Let $X$ be an arbitrary set endowed with the indiscrete topology. Since
the only open sets are $\emptyset$ and $X$, it follows immediately that any sequence $\{x_m\}_{m \geq 1}$ in
$X$ converges and that any element $x \in X$ is a limit. Consequently, a sequence may
converge to more than one point; uniqueness of limits is lost.

Given a subset $S$ of a topological space $X$, we can entertain (at least) two reasonable
formalizations for the informal idea of the set of all points that are infinitesimally
close to $S$. One possibility is given by the closure $\overline{S} = S \cup S'$, where we adjoin to
$S$ all of its cluster points. A second possibility is obtained by considering the set of
all limit points of sequences in $S$. A priori it is not clear how these two possibilities
compare, and we now turn to investigate this question. We limit the discussion just
to the point of developing enough theory to suit the needs of this book. It should
be noted that this is just the tip of the iceberg. The important observation is that,
generally, sequences do not suffice to capture cluster points (but either nets or filters,
two concepts we will not discuss, do). Having said that, we do identify below a broad
enough class of topological spaces in which the limit behaviour is more intuitive,
namely spaces satisfying the first countability axiom.

Theorem 3.6 Let $X$ be a topological space and $S \subseteq X$ a subset. If a sequence in $S$
that is not eventually constant converges to $x_0 \in X$, then $x_0$ is a cluster point of
the subset $S$.

Proof Let $\{s_m\}_{m \geq 1}$ be such a sequence and suppose that $U$ is an open set with
$x_0 \in U$. Since $\{s_m\}_{m \geq 1}$ converges to $x_0$, there exists an $N \in \mathbb{N}$ such that $s_m \in U$
for all $m > N$, and since the sequence is not eventually constant, an $m > N$ exists
with $s_m \neq x_0$. In particular, $U \cap (S - \{x_0\}) \neq \emptyset$, and thus $x_0$ is a cluster point of $S$. □

Theorem 3.7 If $f : X \to Y$ is a continuous function between topological spaces
and $x_m \to x_0$ in $X$, then $f(x_m) \to f(x_0)$ in $Y$.

Proof Given an open set $U \subseteq Y$ with $f(x_0) \in U$, we have that $x_0 \in f^{-1}(U)$, which
is open in $X$ since $f$ is continuous, and thus there exists an $N \in \mathbb{N}$ such that

$$m > N \implies x_m \in f^{-1}(U)$$

and thus

$$m > N \implies f(x_m) \in U$$

as required. □

Remark 3.6 The converses of these two theorems are generally false. To see that, it
suffices to note that if $X$ is an uncountable set endowed with the cocountable topology,
then the only convergent sequences are the eventually constant ones, while every point
$x \in X$ is a cluster point of any cofinite set $S$ (i.e. one where $X - S$ is finite).
3.2.3 Second Countable and First Countable Spaces

We present now two axioms out of a class of properties called countability axioms. 
The conditions we present are quite strong, and have pleasant consequences as we 
will see (particularly to the behaviour of limits and cluster points). While many 
topological spaces fail to satisfy either of the countability axioms we present, the 
class of those that do is broad enough to supply one with plenty of spaces that 
topologically are closer to one’s intuition than their wilder topological cousins are. 

Definition 3.15 A topological space X satisfies the second axiom of countability or 
is said to be second countable if there is a countable basis for the topology on X. 

Thus, second countability means that there exist a countable index set $I$ and a family
$\{U_i\}_{i \in I}$ of open sets in $X$ such that any open set $U$ can be expressed as

$$U = \bigcup_{i \in J_U} U_i$$

for some $J_U \subseteq I$.

Example 3.12 The space $\mathbb{R}$ with the Euclidean topology is second countable. To
see that, we exhibit a countable basis for the topology, namely the collection
$\{B_{1/n}(q)\}_{q \in \mathbb{Q}, n \geq 1}$. First, note that this is a countable collection, as it is indexed
by the set $\mathbb{Q} \times (\mathbb{N} - \{0\})$ (see, e.g., Sect. 1.2.13 of the Preliminaries). Next, given
an open set $U \subseteq \mathbb{R}$ and a point $x \in U$, there exists a $\delta > 0$ such that $B_\delta(x) \subseteq U$.
Let $n \in \mathbb{N}$ satisfy $1/n < \delta/2$. Since the rationals are dense in the reals, we
may find a rational number $q$ for which $|x - q| < 1/n$. It now easily follows that
$x \in B_{1/n}(q) \subseteq B_\delta(x) \subseteq U$, as required by the definition of basis.
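
The choice of $n$ and $q$ in the example can be made explicit. The sketch below is one possible recipe, not the only one: it takes $n = \lfloor 2/\delta \rfloor + 1$ and rounds $x$ to the nearest multiple of $1/(2n)$. The function name `rational_ball_inside` is hypothetical, and `delta` is passed as a `Fraction` so that the choice of $n$ is exact; the point `x` is a float, so the check is only as good as floating-point arithmetic.

```python
from fractions import Fraction
import math

def rational_ball_inside(x, delta):
    """Return (q, n) with x in B_{1/n}(q) and B_{1/n}(q) contained in B_delta(x).

    `delta` is expected to be a positive Fraction.
    """
    n = math.floor(2 / delta) + 1          # n > 2/delta, hence 1/n < delta/2
    q = Fraction(round(x * 2 * n), 2 * n)  # |x - q| <= 1/(4n) < 1/n
    return q, n

x = math.sqrt(2)
delta = Fraction(1, 100)
q, n = rational_ball_inside(x, delta)

radius = Fraction(1, n)
x_in_small_ball = abs(x - q) < radius
# Any y with |y - q| < 1/n satisfies |y - x| < 2/n < delta.
small_ball_in_big_ball = (q - radius > x - delta) and (q + radius < x + delta)
```

The containment follows from the triangle inequality exactly as in the example: $|y - x| \leq |y - q| + |q - x| < 2/n < \delta$.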

Definition 3.16 A subset $S \subseteq X$ in a topological space is said to be dense in $X$ if
$\overline{S} = X$. The space $X$ is called separable if it contains a countable dense subset.

A familiar example of a separable topological space is R with the Euclidean 
topology. Indeed, it is easily verified that the countable set Q is dense in R. In fact, 
R is also seen to be separable by the following general result. 

Theorem 3.8 Every second countable space X is separable. 

Proof Let $\mathcal{B}$ be a countable basis for the topology. For each $B \in \mathcal{B}$ choose an
arbitrary point $x_B \in B$ (if $B = \emptyset$, simply skip it). The set $S = \{x_B \mid B \in \mathcal{B}\}$ is
clearly countable so it remains to show that it is dense. Indeed, fix a point $x \in X$. To
show that $x$ is in the closure of $S$ suppose $x \notin S$ and let $U$ be an open set with $x \in U$.
We now need to show that $U \cap (S - \{x\}) \neq \emptyset$. Indeed, by definition, there exists
a basis element $B$ with $x \in B \subseteq U$. But then $x_B$ must be found in $U \cap (S - \{x\})$,
establishing the claim. □

Many spaces of interest are not second countable but they do satisfy a (considerably) 
weaker condition known as the first axiom of countability. Such spaces, as we will 
shortly see, exhibit a general behaviour that in some respect is quite close to what 
our intuition about convergence and closures in R dictates. 
Definition 3.17 A topological space $X$ satisfies the first axiom of countability or is
said to be first countable if there exists, at every point $x \in X$, a countable local basis.

Evidently a second countable space is also first countable, but not, generally, vice 
versa. 

Recall that the converses of Theorems 3.6 and 3.7 are not generally true in arbitrary
topological spaces. For first countable spaces, however, we have the following
pleasing result.

Theorem 3.9 Let $X$ be a first countable space, that is, at each point $x_0 \in X$ there
exists a countable local basis $\mathcal{B}_{x_0} = \{B_k^{x_0}\}_{k \geq 1}$.

1. For a subset $S \subseteq X$, a point $x_0 \in X$ is in $\overline{S}$ if, and only if, $s_m \to x_0$ for a sequence
$\{s_m\}_{m \geq 1}$ in $S$.

2. A subset $F \subseteq X$ is closed if, and only if, $F$ contains all limit points of sequences
in it.

3. A function $f : X \to Y$ to any topological space $Y$ is continuous if, and only if, it
preserves limits of sequences, i.e.,

$$x_m \to x_0 \implies f(x_m) \to f(x_0)$$

for all sequences $\{x_m\}_{m \geq 1}$ in $X$.

Proof

1. By Theorem 3.6, every limit point of a non-eventually constant sequence in $S$ is
a cluster point of $S$, and thus is in $\overline{S}$. If a sequence in $S$ is eventually constant,
with eventual value $s \in S$, then every open set containing a limit $x_0$ contains $s$,
so either $x_0 = s \in S$ or $x_0$ is a cluster point of $S$; in both cases $x_0 \in \overline{S}$.
Suppose now that $x_0 \in X$ is in $\overline{S}$ and, of course, the interesting case is when
$x_0 \notin S$. In that case $x_0$ is a cluster point of $S$. Therefore, for every $m \geq 1$ the
open set $B_1^{x_0} \cap \cdots \cap B_m^{x_0}$ intersects $S - \{x_0\}$ non-trivially, and let $s_m$ be an
arbitrary element in the intersection. Clearly the sequence $\{s_m\}_{m \geq 1}$ is in $S$ and
its limit is $x_0$. Indeed, given any open set $U \subseteq X$ with $x_0 \in U$, there exists a
basis element $B_N^{x_0}$ with $x_0 \in B_N^{x_0} \subseteq U$. But then

$$m > N \implies s_m \in B_1^{x_0} \cap \cdots \cap B_m^{x_0} \subseteq B_N^{x_0} \subseteq U.$$

2. By Corollary 3.1, $F$ is closed if, and only if, $F = \overline{F}$, and we just proved that the
elements of $\overline{F}$ coincide with limit points of sequences in $F$.

3. By Theorem 3.7, if $f : X \to Y$ is continuous, then it preserves limits of
sequences. Suppose thus that $f : X \to Y$ preserves limits of sequences, and our
aim is to show that $f$ is continuous. We need to show that $f^{-1}(U)$ is open whenever
$U \subseteq Y$ is open, or (equivalently!) that $f^{-1}(F)$ is closed whenever $F \subseteq Y$ is
closed. It suffices to show that $f^{-1}(F)$ contains its limit points, so suppose that

$$x_m \to x_0$$

in $X$ with $\{x_m\}_{m \geq 1}$ a sequence in $f^{-1}(F)$, and we will show that $x_0 \in f^{-1}(F)$.
Indeed, $f$ preserves limits of sequences and thus $f(x_m) \to f(x_0)$, but then $f(x_0)$
is a limit point of a sequence in $F$, and since the latter is closed, Theorem 3.6
implies that $f(x_0) \in F$, showing that $x_0 \in f^{-1}(F)$. □

Exercises 

Exercise 3.11 Prove that the boundary $\partial(S)$ of any subset $S$ of a topological space
$X$ is a closed set.

Exercise 3.12 Prove that $\partial S = \partial(X - S)$ for any subset $S$ of a topological space $X$.

Exercise 3.13 Prove that a subset $S$ of a topological space $X$ is closed if, and only
if, it contains its boundary.

Exercise 3.14 Prove that a set $S$ of a topological space is open if, and only if, it is
disjoint from its boundary.

Exercise 3.15 Prove that $\overline{\overline{S}} = \overline{S}$ for all subsets $S$ of a topological space $X$.

Exercise 3.16 Find an example of a topological space $X$ and a non-empty subset
$S \subseteq X$ such that

$$\mathrm{int}(S) = \mathrm{ext}(S) = \emptyset.$$

Exercise 3.17 Let $X$ be an uncountable set endowed with the cocountable topology.
Show that the only convergent sequences in $X$ are the eventually constant ones, while
for every cofinite set $S \subseteq X$ (i.e., $X - S$ is finite) $S' = X$.

Exercise 3.18 Construct a first countable space that is not second countable.

Exercise 3.19 Prove that a countable first countable space is second countable. That
is, if $X$ is a countable set endowed with a first countable topology, then it is in fact a
second countable topology.

Exercise 3.20 Prove that if $X$ is a second countable topological space, then $|\tau_X|$,
the cardinality of $\tau_X$, is at most $|\mathbb{R}|$, the cardinality of the real numbers. Does the
converse hold? Does the same result hold if second countability is replaced by first
countability?


3.3 Constructing Topologies 

In this section we discuss two ways of constructing topologies. The first situation is
where one has an arbitrary set $X$ and a collection of subsets of it that one would like to
have as the open sets in a topology on $X$. The second scenario concerns a collection
of functions one would like to force to be continuous. This construction is used to 
obtain products, coproducts, and quotients for topological spaces. 
3.3.1 Generating Topologies

We have seen above that it is at times convenient to have a manageable basis for
a topology since one can effectively work with the basis elements rather than the
entire topology. Given an arbitrary set $X$, without any prior notion of topology, and
a collection $\mathcal{B}$ of subsets of $X$, it is natural to ask whether there exists a topology $\tau$
on $X$ for which $\mathcal{B}$ is a basis, and, if such a topology exists, whether it is unique. The
answers are quite decisive.

Theorem 3.10 (Basis Generating Topology) A collection $\mathcal{B}$ of subsets of a set $X$ is
a basis for a topology $\tau$ on $X$ if, and only if, the following conditions are met.

1. For every $x \in X$ there exists a $B \in \mathcal{B}$ such that $x \in B$. Equivalently,

$$X = \bigcup_{B \in \mathcal{B}} B.$$

2. For all $B_1, B_2 \in \mathcal{B}$, if $x \in B_1 \cap B_2$, then there exists $B_3 \in \mathcal{B}$ such that

$$x \in B_3 \subseteq B_1 \cap B_2.$$

Moreover, when the conditions are met, the topology $\tau$ is unique, namely $\tau$ is the
smallest topology on $X$ such that every $B \in \mathcal{B}$ is open.

Proof Let us call a subset $U \subseteq X$ open if for every $x \in U$ there exists a $B \in \mathcal{B}$
such that

$$x \in B \subseteq U.$$

Equivalently, $U \subseteq X$ is open if $U$ is a union of elements from $\mathcal{B}$. We show that
the collection $\tau$ of all such open sets forms a topology on $X$. The empty set $\emptyset$ is
vacuously open since it has no points, and the set $X$ itself is open by condition 1.
Stability under arbitrary unions is straightforward, so let us establish stability under
finite intersections. Let $U_1, U_2 \subseteq X$ be open and we need to show that $U = U_1 \cap U_2$
is open. Indeed, if $x \in U$, then $x \in U_1$ and thus there is a $B_1 \in \mathcal{B}$ such that
$x \in B_1 \subseteq U_1$. Similarly, there is a $B_2 \in \mathcal{B}$ with $x \in B_2 \subseteq U_2$. As $x \in B_1 \cap B_2$,
condition 2 gives us an element $B_3 \in \mathcal{B}$ with

$$x \in B_3 \subseteq B_1 \cap B_2 \subseteq U_1 \cap U_2 = U,$$

as needed.

The claim that $\mathcal{B}$ is a basis for $\tau$ is immediate from the construction. As for the
minimality claim about $\tau$, note that if $\tau'$ is any topology on $X$ for which all $B \in \mathcal{B}$
are open, then any $U \in \tau$, being a union of elements from $\mathcal{B}$, must also be open in
$\tau'$. In other words, $\tau \subseteq \tau'$. The uniqueness of $\tau$ now follows too. □
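
The two conditions of Theorem 3.10, and the topology built in its proof, can be explored on a finite set. The following sketch assumes collections of subsets are given as lists of frozensets; the collections `good` and `bad` are hypothetical examples chosen for illustration.

```python
from itertools import combinations

def satisfies_basis_conditions(X, B):
    """Check conditions 1 and 2 of Theorem 3.10 for a collection B of subsets of X."""
    covers = all(any(x in b for b in B) for x in X)
    refines = all(
        any(x in b3 and b3 <= (b1 & b2) for b3 in B)
        for b1 in B for b2 in B for x in (b1 & b2)
    )
    return covers and refines

def generated_topology(B):
    """All unions of subfamilies of B (the empty union gives the empty set)."""
    members = list(B)
    opens = set()
    for r in range(len(members) + 1):
        for family in combinations(members, r):
            opens.add(frozenset().union(*family))
    return opens

X = frozenset({1, 2, 3})
good = [frozenset({1, 2}), frozenset({2, 3}), frozenset({2})]
bad = [frozenset({1, 2}), frozenset({2, 3})]  # no element sits inside {1,2} & {2,3}

tau = generated_topology(good)
```

The collection `bad` covers `X` but fails condition 2 at the point 2, so it is not a basis for any topology, although it can still serve as a subbasis.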
We may now pose the exact same question as above, replacing basis by subbasis.
Thus, given an arbitrary collection $\mathcal{B}$ of subsets of $X$, is there a topology $\tau$ on $X$ for
which $\mathcal{B}$ is a subbasis, and is this topology unique?

Theorem 3.11 (Subbasis Generating Topology) A collection $\mathcal{B}$ of subsets of a set
$X$ is a subbasis for a topology $\tau$ on $X$ if, and only if,

$$\bigcup_{B \in \mathcal{B}} B = X.$$

Moreover, in that case the topology $\tau$ is unique, namely $\tau$ is the smallest topology
on $X$ such that every $B \in \mathcal{B}$ is open.

Proof One may reduce the proof to that of the previous result by noting that the
collection

$$\mathcal{B}' = \{B_1 \cap \cdots \cap B_m \mid B_1, \ldots, B_m \in \mathcal{B}, m \geq 0\},$$

consisting of all finite intersections of elements of $\mathcal{B}$, does satisfy the conditions
of being a basis for a topology, and thus the previous result may be applied to $\mathcal{B}'$,
giving rise to a topology $\tau$. In other words, we declare a subset $U \subseteq X$ to be open
if it is an arbitrary union of finite intersections of elements from $\mathcal{B}$. The details are
left to the reader. □
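
The two-step recipe in the proof (finite intersections first, arbitrary unions second) is easy to carry out on a finite set. The sketch below assumes the convention of Remark 3.5 that the empty intersection is $X$ itself; the subbasis used is a hypothetical example.

```python
from itertools import combinations

def finite_intersections(X, subbasis):
    """Basis obtained from a subbasis; the empty intersection is taken to be X."""
    members = list(subbasis)
    result = set()
    for r in range(len(members) + 1):
        for family in combinations(members, r):
            result.add(X.intersection(*family))
    return result

def all_unions(basis):
    """All unions of subfamilies of `basis` (the empty union is the empty set)."""
    members = list(basis)
    opens = set()
    for r in range(len(members) + 1):
        for family in combinations(members, r):
            opens.add(frozenset().union(*family))
    return opens

X = frozenset({1, 2, 3})
subbasis = [frozenset({1, 2}), frozenset({2, 3})]

basis = finite_intersections(X, subbasis)  # adds {2} and X to the subbasis
tau = all_unions(basis)                    # the topology generated by the subbasis
```

On a finite set this brute-force enumeration is exponential in the size of the collection, so it is only meant for small illustrative examples.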

The results above, and their proofs, provide one with powerful tools for creating
topologies. Any topology constructed from a collection $\mathcal{B}$ by the above techniques
is said to be generated by $\mathcal{B}$.

The following quite general result is used often to produce interesting topologies. 
The result states, roughly, that given any collection of functions with a common 
domain or codomain, a canonical topology exists rendering all the given functions 
continuous. We remark that, in a sense, topological spaces exist as the servants
of continuous functions; we define topological spaces since we are interested in 
continuous mappings. This result thus allows one to tailor a suitable topology from 
a given collection of functions. 

Proposition 3.11 (Topologies Induced by Functions)

1. Let $X$ be a set and $\{f_i : X \to X_i\}_{i \in I}$ a non-empty collection of functions with
each $X_i$ a topological space. There exists then a unique smallest topology on $X$
such that every $f_i : X \to X_i$ is continuous.

2. Let $X$ be a set and $\{f_i : X_i \to X\}_{i \in I}$ a non-empty collection of functions with
each $X_i$ a topological space. There exists then a unique largest topology on $X$
such that every $f_i : X_i \to X$ is continuous.

Proof

1. Suppose that $\tau$ is a topology on $X$ such that each $f_i : X \to X_i$ is continuous.
That means that $f_i^{-1}(U) \in \tau$ for each $i \in I$ and each $U \in \tau_{X_i}$. In other words
$\tau \supseteq \mathcal{B}$ where

$$\mathcal{B} = \{f_i^{-1}(U) \mid i \in I, U \in \tau_{X_i}\}$$

and so we see that we must define $\tau$ to be the topology generated by $\mathcal{B}$. This is
feasible since $\mathcal{B}$ is easily seen to satisfy the condition of Theorem 3.11.

2. Suppose that $\tau$ is a topology on $X$ such that each $f_i : X_i \to X$ is continuous.
That means that $\tau$ cannot contain any subset $S \subseteq X$ for which there exists an
$i \in I$ with $f_i^{-1}(S) \notin \tau_{X_i}$. So, to obtain the largest possible topology on $X$, we
consider the collection

$$\tau = \{S \subseteq X \mid f_i^{-1}(S) \in \tau_{X_i} \text{ for all } i \in I\}.$$

If we can show that $\tau$ is indeed a topology, then the proof is complete. Notice that

$$\tau = \bigcap_{i \in I} \{S \subseteq X \mid f_i^{-1}(S) \in \tau_{X_i}\}$$

and that (the reader is invited to prove) the intersection of a family of topologies
on $X$ is again a topology on $X$. Thus we only need to establish that the collection
$\{S \subseteq X \mid f_i^{-1}(S) \in \tau_{X_i}\}$, for a single $i \in I$, is a topology. This follows
immediately from well-known properties of the inverse image function, as the
reader may verify. □

When a topology r is constructed as above, we say that it is generated by the given 
collection of functions. We note at once that it is quite possible that the collection of 
functions consists of just one single function. 

Remark 3.7 Definition 3.11 of the subspace topology on a subset $Y$ of a topological
space $X$ is already an example of this method of constructing topologies. Indeed,
the subspace topology is precisely the topology generated by the single function
$i : Y \to X$, the inclusion function (see Proposition 3.8).
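
Both parts of Proposition 3.11 can be tried out on finite sets. The sketch below treats only the case of a single function in part 1 (where the preimages of open sets already form a topology, so no further generation step is needed) and the general case of part 2. All spaces and maps are hypothetical examples, and the function names are illustrative.

```python
from itertools import combinations

def powerset(points):
    """All subsets of `points`, as frozensets."""
    pts = list(points)
    return [frozenset(c) for r in range(len(pts) + 1) for c in combinations(pts, r)]

def preimage(f, S, domain):
    """Inverse image of S under the function f (given as a dict)."""
    return frozenset(x for x in domain if f[x] in S)

def initial_topology_single(f, domain, tau_codomain):
    """Smallest topology on `domain` making the single map f continuous."""
    return {preimage(f, U, domain) for U in tau_codomain}

def final_topology(X, maps):
    """Largest topology on X making every f_i : X_i -> X continuous.

    `maps` is a list of triples (f_i, X_i, tau_i).
    """
    return {
        S for S in powerset(X)
        if all(preimage(f, S, dom) in tau for f, dom, tau in maps)
    }

# Part 1 with the inclusion of a subset: this recovers the subspace topology.
X = frozenset({1, 2, 3, 4})
tau_x = {frozenset(), frozenset({1}), frozenset({1, 2}), frozenset({1, 2, 3}), X}
Y = frozenset({2, 4})
inclusion = {y: y for y in Y}
tau_y = initial_topology_single(inclusion, Y, tau_x)

# Part 2 with a single surjection onto a two-point set.
X1 = frozenset({1, 2, 3})
tau_1 = {frozenset(), frozenset({1}), frozenset({1, 2}), X1}
Z = frozenset("ab")
f = {1: "a", 2: "a", 3: "b"}
tau_z = final_topology(Z, [(f, X1, tau_1)])
```

The first computation agrees with the formula of Definition 3.11, which is the content of the remark above.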


3.3.2 Coproducts, Products, and Quotients 

The rest of this section utilizes the constructions above in three particular scenarios. 

Definition 3.18 Let $X$ and $Y$ be topological spaces and assume $X \cap Y = \emptyset$. The
topology on $X \cup Y$ generated by $\{i_X : X \to X \cup Y, i_Y : Y \to X \cup Y\}$, where $i_X$
and $i_Y$ are the inclusion functions, is called the coproduct topology on $X \cup Y$, and
the space $X \cup Y$ is called the coproduct of $X$ and $Y$.

Examples are abundant, but some care is required in order to hone the intuition. For 
instance, in R with the Euclidean topology, the two subspaces Q and R — Q are 
clearly disjoint, so one can form their coproduct which would give a topology on R 
which is not the Euclidean topology. 
The situation easily generalizes to infinitely many spaces, as follows. Suppose
$\{X_i\}_{i \in I}$ is a non-empty collection of topological spaces that are pairwise disjoint.
The coproduct of all the $X_i$ is then the set

$$X = \bigcup_{i \in I} X_i$$

endowed with the topology generated by the family of inclusions $\{X_i \to X\}_{i \in I}$. As
an example, a topological space $X$ is discrete if, and only if, it is the coproduct of its
singleton subspaces.

Definition 3.19 Let $X$ and $Y$ be two topological spaces, and consider the cartesian
product $X \times Y$. The topology generated by the set of projections

$$\{\pi_X : X \times Y \to X,\ \pi_Y : X \times Y \to Y\}$$

is called the product topology on $X \times Y$, and the space $X \times Y$ is called the cartesian
product or simply the product of $X$ and $Y$.
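
For finite spaces the product topology can be computed directly. The sketch below assumes that "generated by the projections" means the smallest topology for which both projections are continuous, so that the preimages of open sets under $\pi_X$ and $\pi_Y$ form a subbasis; the function names `generate` and `product_topology` are hypothetical, and the example space is the two-point space with open sets $\emptyset$, $\{1\}$, $\{0, 1\}$.

```python
from itertools import product

def generate(X, subbasis):
    """Smallest topology on the finite set X containing `subbasis`."""
    tau = {frozenset(), frozenset(X)} | set(subbasis)
    changed = True
    while changed:
        changed = False
        # Close under pairwise unions and intersections until stable.
        for A in list(tau):
            for B in list(tau):
                for C in (A | B, A & B):
                    if C not in tau:
                        tau.add(C)
                        changed = True
    return tau

def product_topology(X, tau_X, Y, tau_Y):
    """Topology on X x Y generated by the two projections."""
    XY = set(product(X, Y))
    # Preimages of open sets under the projections form a subbasis.
    sub = {frozenset(product(U, Y)) for U in tau_X}
    sub |= {frozenset(product(X, V)) for V in tau_Y}
    return XY, generate(XY, sub)

S = {0, 1}
tau_S = {frozenset(), frozenset({1}), frozenset({0, 1})}
XY, tau = product_topology(S, tau_S, S, tau_S)
print(len(tau))
```

Note that the open set $\{(0, 1), (1, 0), (1, 1)\}$ appears in the result although it is not itself of the form $U \times V$; the sets $U \times V$ only form a basis.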

This situation also naturally extends to the infinite case. If $\{X_i\}_{i \in I}$ is a non-empty
family of topological spaces, then their cartesian product is the set $X$, the set-theoretic
cartesian product of the $X_i$, endowed with the topology generated by the family of
projections $\{\pi_i : X \to X_i\}_{i \in I}$.

Let us consider the case of a product of countably many spaces $\{X_m\}_{m \ge 1}$. The
set-theoretic cartesian product $X = X_1 \times \cdots \times X_m \times \cdots$ consists of all sequences
$x = (x_1, \ldots, x_m, \ldots)$ with $x_m \in X_m$ for each $m \ge 1$. The projection $\pi_m : X \to X_m$
is given by $\pi_m(x) = x_m$. Following the definitions above, the product
topology on $X$ is the one generated by the sets of the form $U_1 \times \cdots \times U_m \times \cdots$
where each $U_m$ is an open set in $X_m$ and $U_m = X_m$ for all but finitely many $m$.

Example 3.13 A familiar example is $\mathbb{R}^n$, the $n$-fold product of the topological space
$\mathbb{R}$ with itself, when $\mathbb{R}$ is given the Euclidean topology. For $n \ge 2$ the resulting product
topology on $\mathbb{R}^n$ is also called the Euclidean topology. Another example is $\mathbb{R}^\infty$ as a
countable product of $\mathbb{R}$ with itself.

The final construction we present is that of the quotient topology. Let $X$ be a set with
an equivalence relation $\sim$ on it. Recall (e.g., Sect. 1.2.15 of the Preliminaries) that there
is then the associated quotient set $X/{\sim} = \{[x] \mid x \in X\}$ of all equivalence classes
$[x]$, and the canonical projection $\pi : X \to X/{\sim}$ given by $\pi(x) = [x]$. When this
situation is enriched by the presence of a topology on $X$, it is natural to seek out a
topology on $X/{\sim}$ as well. With the tools we now have, this is immediate.

Definition 3.20 Let $X$ be a topological space and $\sim$ an equivalence relation on the
set $X$. The topology on $X/{\sim}$ generated by the projection function $\pi : X \to X/{\sim}$ is
called the quotient topology on $X/{\sim}$, and the space $X/{\sim}$ is called the quotient of $X$
modulo $\sim$.

Example 3.14 Consider $[0, 1]$ as a subspace of $\mathbb{R}$ with its standard topology. Define
the equivalence relation $\sim$ on $[0, 1]$ by $x \sim y$ precisely when $x = y$ or $x, y \in \{0, 1\}$.
The quotient space $[0, 1]/{\sim}$ is homeomorphic to the space $S^1 = \{x \in \mathbb{R}^2 \mid \|x\| = 1\}$
considered as a subspace of $\mathbb{R}^2$ with the Euclidean topology. Informally, $\sim$ glues the
two ends of the interval $[0, 1]$, resulting in a circle.
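
A finite analogue of gluing can be computed explicitly. The following sketch takes the quotient topology to consist of those sets of equivalence classes whose preimage under the canonical projection is open in $X$; the three-point space, its topology, and the names `quotient_topology` and `cls` are illustrative assumptions rather than anything fixed by the text.

```python
from itertools import chain, combinations

def powerset(xs):
    """Return all subsets of xs as frozensets."""
    xs = list(xs)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(xs, r) for r in range(len(xs) + 1))]

def quotient_topology(X, tau, cls):
    """Quotient topology, where cls[x] is the equivalence class of x."""
    classes = {cls[x] for x in X}

    def pre(U):
        # Preimage under the projection: all points whose class lies in U.
        return frozenset(x for x in X if cls[x] in U)

    return classes, {U for U in powerset(classes) if pre(U) in tau}

# A hypothetical three-point "interval" whose two ends 0 and 2 are glued.
X = {0, 1, 2}
tau = {frozenset(), frozenset({0}), frozenset({0, 1}), frozenset({0, 1, 2})}
A, B = frozenset({0, 2}), frozenset({1})
cls = {0: A, 1: B, 2: A}

classes, tau_q = quotient_topology(X, tau, cls)
print(len(tau_q))
```

Here neither $\{0, 2\}$ nor $\{1\}$ is open in $X$, so the quotient carries the indiscrete topology; gluing can destroy separation, a theme that reappears in the exercises on Hausdorff quotients.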


Exercises 


Exercise 3.21 Let $X$ be a set and $\{\tau_i\}_{i \in I}$ a collection of topologies on $X$. Prove that

$$\bigcap_{i \in I} \tau_i$$

is a topology as well. In contrast, prove that the union of two topologies on $X$
need not be a topology.

Exercise 3.22 Let $X$ be a set and $\mathcal{B}$ a collection of subsets of $X$ satisfying the
conditions of Theorem 3.11. Prove that the intersection of all topologies $\tau$ with
$\mathcal{B} \subseteq \tau$ is precisely the topology generated by $\mathcal{B}$.

Exercise 3.23 Prove that the collection $\{[a, b) \mid a, b \in \mathbb{R},\ a < b\}$ is a basis for a
topology on $\mathbb{R}$, known as the lower-limit topology. When $\mathbb{R}$ is equipped with this
topology it is called the Sorgenfrey line, denoted by $\mathbb{R}_\ell$. We will denote $\mathbb{R}$ with the
Euclidean topology by $\mathbb{R}_E$. Compare these two topologies on $\mathbb{R}$ and investigate the
meaning of continuity of functions $f : \mathbb{R}_\ell \to \mathbb{R}_E$.

Exercise 3.24 Let X be a topological space. Prove that X is discrete if, and only if, 
X is the coproduct of all its singleton subspaces. 

Exercise 3.25 Prove that for all topological spaces $X$, $Y$, $Z$, the spaces $X \times (Y \times Z)$,
$X \times Y \times Z$, and $(X \times Y) \times Z$ are homeomorphic. Is there any situation in which
these spaces are equal?

Exercise 3.26 Prove or find a counterexample to the claim that for all topological
spaces $X$ and $Y$, any subspace $S$ of $X \times Y$ is necessarily of the form of the topological
product $S_X \times S_Y$ for some subspaces $S_X \subseteq X$ and $S_Y \subseteq Y$.

Exercise 3.27 Let $X$ and $Y$ be topological spaces and consider the relation $\sim$ on the set
$X \times Y$ given as follows. Declare $(x, y) \sim (x', y')$ precisely when $x = x'$ (obviously
an equivalence relation). Prove that when $X \times Y$ is endowed with the product topology,
the quotient space $(X \times Y)/{\sim}$ is homeomorphic to $X$. Can equality ever hold?

Exercise 3.28 Consider $\mathbb{R}$ with the Euclidean topology. Prove that the collection
$\{U_1 \times \cdots \times U_m \times \cdots \mid U_m \subseteq \mathbb{R} \text{ are open}\}$ is a basis for a topology on
$\mathbb{R}^\infty$. This topology is called the box topology. Of the box topology and the product
topology on $\mathbb{R}^\infty$, which one is finer?

Exercise 3.29 Let $X$ be a topological space and $\sim$ an equivalence relation on $X$.
Prove that a set $U \subseteq X/{\sim}$ (which is a collection of equivalence classes, thus of
subsets of $X$) is open in the quotient topology if, and only if,

$$\bigcup_{[x] \in U} [x]$$

is open in $X$.

Exercise 3.30 Construct a quotient space that is homeomorphic to a torus in $\mathbb{R}^3$.


3.4 Separation and Connectedness 

Separation properties of a topological space relate to the ability of the open sets to 
separate distinct points, a point from a set, or two disjoint sets. We only consider 
one such property, the Hausdorff separation property, which is the strongest of the 
separation axioms pertaining to the ability to separate distinct points. We then turn 
our attention to the notion of connectivity for a topological space, a notion that is 
somewhat more subtle than one might initially expect. 


3.4.1 The Hausdorff Separation Property 

If X is endowed with the indiscrete topology, then the open sets (of which there are 
only two) are, in a sense, blind to the individual points in the space. The open sets 
are simply not sufficiently refined to be able to distinguish between different points. 
This is an extreme situation of course and it goes against one’s intuition of how 
distinct points should (in some sense) behave. The following separation property 
immediately brings some of the familiarity back. 

Definition 3.21 A topological space $X$ is said to satisfy the Hausdorff separation
property, or simply to be Hausdorff, if for all distinct points $x, y \in X$ there exist
disjoint open sets $U$ and $V$ such that $x \in U$ and $y \in V$. Such open sets are said to
separate $x$ and $y$.

Remark 3.8 The Hausdorff separation property is also commonly referred to as $T_2$,
indicating its position in a hierarchy of several separation axioms, starting with $T_0$,
the weakest one. We will not delve into this issue here.

Example 3.15 $\mathbb{R}$ with the Euclidean topology is Hausdorff. Indeed, if $x \ne y$ are real
numbers, then setting

$$\varepsilon = \frac{|x - y|}{2}$$

one easily sees that

$$B_\varepsilon(x) \cap B_\varepsilon(y) = \emptyset,$$

thus clearly exhibiting open sets that separate $x$ and $y$.

Example 3.16 An example of a non-Hausdorff space is given by the Sierpinski space
$\mathbb{S} = \{0, 1\}$. Its two points clearly cannot be separated, since the topology is given by
$\{\emptyset, \{1\}, \{0, 1\}\}$. In fact, a finite topological space $X$ is Hausdorff if, and only if, it is
discrete. Indeed any discrete space is clearly Hausdorff, since any two distinct points
$x$ and $y$ are separated by the open sets $\{x\}$ and $\{y\}$. In the other direction, suppose $X$
is finite and Hausdorff, and choose some $x \in X$. For each $y \in X - \{x\}$ we may find
disjoint open sets $U_y$ and $V_y$ such that $x \in U_y$ and $y \in V_y$. Then, the intersection

$$U = \bigcap_{y \in X - \{x\}} U_y$$

is an open set (this is where the finiteness assumption on $X$ is used), and it contains
$x$. Further, since $U \subseteq U_y$, it follows that

$$U \cap V_y \subseteq U_y \cap V_y = \emptyset$$

and thus $U$ is an open set that contains $x$ but does not contain any $y \in X - \{x\}$.
In other words, $U = \{x\}$, and thus we proved that $\{x\}$ is an open set. Since $x$ was
arbitrary, it follows that $X$ is discrete.
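
The claim that a finite space is Hausdorff precisely when it is discrete can be checked by brute force on a small set. The sketch below enumerates every topology on a three-point set and compares the two properties; the representation of a topology as a set of frozensets and all function names are our own assumptions, and the check is an illustration of the statement, not a substitute for the proof.

```python
from itertools import chain, combinations

def powerset(xs):
    """Return all subsets of xs as frozensets."""
    xs = list(xs)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(xs, r) for r in range(len(xs) + 1))]

def is_topology(X, tau):
    """Topology axioms; pairwise closure suffices on a finite set."""
    return (frozenset() in tau and frozenset(X) in tau
            and all((A | B) in tau and (A & B) in tau
                    for A in tau for B in tau))

def is_hausdorff(X, tau):
    """Every pair of distinct points has disjoint open neighbourhoods."""
    return all(any(x in U and y in V and not (U & V)
                   for U in tau for V in tau)
               for x in X for y in X if x != y)

def is_discrete(X, tau):
    """Every singleton is open."""
    return all(frozenset({x}) in tau for x in X)

X = {0, 1, 2}
# Every collection of subsets of X that happens to be a topology.
topologies = [set(c) for c in powerset(powerset(X)) if is_topology(X, set(c))]
agree = all(is_hausdorff(X, t) == is_discrete(X, t) for t in topologies)
print(len(topologies), agree)

# The Sierpinski space from the example above.
sierpinski = {frozenset(), frozenset({1}), frozenset({0, 1})}
print(is_hausdorff({0, 1}, sierpinski))
```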

One immediate consequence of the Hausdorff property is the following result. 

Theorem 3.12 A convergent sequence $\{x_m\}_{m \ge 1}$ in a Hausdorff space $X$ converges
to a unique limit.

Proof Suppose that $\{x_m\}_{m \ge 1}$ converges to two different points $x$ and $y$. There exist,
by the Hausdorff property, two disjoint open sets $U_x$ and $U_y$ with $x \in U_x$ and $y \in U_y$.
By Definition 3.14, there exist $N_x, N_y \in \mathbb{N}$ such that

$$m > N_x \implies x_m \in U_x$$

$$m > N_y \implies x_m \in U_y.$$

But then, for $m > \max\{N_x, N_y\}$, we have that $x_m \in U_x \cap U_y$, a contradiction since
$U_x$ and $U_y$ are disjoint. □


3.4.2 Path-Connected and Connected Spaces 

The second property we discuss is that of connectedness which, to immediately
dispel any misconception, is not opposed to the Hausdorff separation property (or any
other separation property). There are two notions to discuss, path connectivity and
connectivity; the former is more immediately intuitive than the latter, and thus is the
one we present first.

Definition 3.22 A space $X$ is said to be path-connected if for all points $x, y \in X$
there exists a continuous function $\gamma : [0, 1] \to X$ with $\gamma(0) = x$ and $\gamma(1) = y$.

Here the topology given to the interval $[0, 1]$ is the subspace topology induced by
the Euclidean topology on $\mathbb{R}$. A continuous function $\gamma : [0, 1] \to X$ is naturally
thought of as a path in $X$ with initial point $\gamma(0)$ and final destination $\gamma(1)$. Thus, $X$ is
path-connected if any two points in the space can be connected by a path in the space.

Example 3.17 The space $\mathbb{R}^n$ is path-connected for all $n \ge 1$. Indeed, given points
$x = (x_1, \ldots, x_n)$ and $y = (y_1, \ldots, y_n)$, it is easily verified that $\gamma : [0, 1] \to \mathbb{R}^n$
given by

$$\gamma(t) = t(y_1, \ldots, y_n) + (1 - t)(x_1, \ldots, x_n)$$

is a path connecting $x$ and $y$. The same argument shows that any subset of $\mathbb{R}^n$ of the
form

$$[a_1, b_1] \times \cdots \times [a_n, b_n],$$

a product of intervals, is path-connected. In particular, any interval $[a, b]$ in $\mathbb{R}$ is
path-connected. More generally, any convex subset of $\mathbb{R}^n$ is path-connected. The
reader is invited to verify that another example of a path-connected space is
$S = \{x \in \mathbb{R}^n \mid \|x\| = 1\}$, the unit sphere, viewed as a subspace of $\mathbb{R}^n$ with the Euclidean
topology, provided that $n > 1$.
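
The straight-line path is easy to experiment with. The following sketch evaluates $\gamma(t) = ty + (1 - t)x$ with exact rational arithmetic and checks, on a few sample parameters, that the path stays inside a box containing both endpoints; the function name `line_path`, the chosen points, and the box are illustrative assumptions. Sampling is of course no proof of continuity, it merely makes the formula concrete.

```python
from fractions import Fraction

def line_path(x, y):
    """Return gamma with gamma(t) = t*y + (1 - t)*x, computed coordinatewise."""
    def gamma(t):
        return tuple(t * yi + (1 - t) * xi for xi, yi in zip(x, y))
    return gamma

x = (Fraction(0), Fraction(1))
y = (Fraction(2), Fraction(3))
gamma = line_path(x, y)

# Sample the path and check that it stays in the box [0, 2] x [1, 3].
box = [(0, 2), (1, 3)]
samples = [gamma(Fraction(k, 10)) for k in range(11)]
inside = all(lo <= c <= hi
             for p in samples
             for c, (lo, hi) in zip(p, box))
print(gamma(0) == x, gamma(1) == y, inside)
```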

An example of a space that is not path-connected is $\mathbb{Q}$ as a subspace of $\mathbb{R}$ with the
Euclidean topology. In fact, no two distinct points $x, y \in \mathbb{Q}$ can be connected by a
path. To see that, one can show that any path $\gamma : [0, 1] \to \mathbb{Q}$ is constant. However,
we establish the same result by utilizing the concept of connectivity, to which we now
turn.

Definition 3.23 A topological space $X$ is said to be disconnected if there exist
non-empty open sets $U$ and $V$ such that $U \cap V = \emptyset$ and $U \cup V = X$. A space $X$ is
said to be connected if it is not disconnected.
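
For a finite space the definition can be tested by examining every pair of open sets. The sketch below is one direct transcription, with a topology represented as a set of frozensets; the name `is_connected` and the two sample spaces are our own choices.

```python
def is_connected(X, tau):
    """True if X is not the union of two disjoint non-empty open sets."""
    X = frozenset(X)
    return not any(U and V and not (U & V) and (U | V) == X
                   for U in tau for V in tau)

# The two-point space with open sets {}, {1}, {0, 1} is connected ...
sierpinski = {frozenset(), frozenset({1}), frozenset({0, 1})}
print(is_connected({0, 1}, sierpinski))

# ... while the discrete two-point space is separated by {0} and {1}.
discrete = {frozenset(), frozenset({0}), frozenset({1}), frozenset({0, 1})}
print(is_connected({0, 1}, discrete))
```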

Theorem 3.13 Any closed interval $[a, b]$, as a subspace of $\mathbb{R}$ with the Euclidean
topology, is connected.

Proof Without loss of generality let us assume that $[a, b] = [0, 1]$. Suppose that
$[0, 1]$ is not connected. Then there exist non-empty open subsets $U$ and $V$ in the
subspace topology such that $U \cup V = [0, 1]$ and $U \cap V = \emptyset$. Since a singleton
set is not open in the subspace topology on $[0, 1]$ it follows that neither $U$ nor $V$ is a
singleton set. In particular, we may assume that there exist $a \in U$ and $b \in V$ with
$0 < a < b < 1$. The set $S = \{x \in U \mid x < b\}$ is non-empty and bounded above,
and so admits a supremum $s$, and notice that $0 < s < 1$.

Now, since $[0, 1] = U \cup V$ we must have that either $s \in U$ or $s \in V$. If $s \in U$ then
$s < b$ and, $U$ being open, there exists an $a' \in U$ with $s < a' < b$, contrary to the
choice of $s$. On the other hand, if $s \in V$ then, $V$ being open, there exists an $\varepsilon > 0$ with
$(s - \varepsilon, s + \varepsilon) \subseteq V$, again contrary to the choice of $s$. The assumption of the existence
of $U$ and $V$ as above thus leads to a contradiction, and so $[0, 1]$ is connected. □

Remark 3.9 The intuition behind the concept of connectedness is perhaps better 
explained by referring to the word ‘samenhangend’, the term used for this property 
in the Dutch language. The literal meaning of ‘samenhangend’ is ‘hanging together’, 
and it is this topological quality that the definition of connectedness captures. For 
instance, it is intuitively clear that if a space is path-connected, then it must be 
‘hanging together’. This is indeed the case.

Lemma 3.1 If a topological space $X$ is path-connected, then it is connected.

Proof Suppose that $X$ is disconnected. That means that there exist disjoint non-empty
open sets $U, V \subseteq X$ such that

$$U \cup V = X.$$

Choose points $x \in U$ and $y \in V$ and, utilizing the assumption that the space $X$ is
path-connected, let $\gamma : [0, 1] \to X$ be a path from $x$ to $y$. Since $\gamma$ is continuous, it
follows that $\gamma^{-1}(U)$ and $\gamma^{-1}(V)$ are open sets in $[0, 1]$. Further,

$$\gamma^{-1}(U) \cap \gamma^{-1}(V) = \gamma^{-1}(U \cap V) = \gamma^{-1}(\emptyset) = \emptyset$$

and

$$\gamma^{-1}(U) \cup \gamma^{-1}(V) = \gamma^{-1}(U \cup V) = \gamma^{-1}(X) = [0, 1].$$

Lastly, since $x = \gamma(0)$ and $x \in U$ it follows that $0 \in \gamma^{-1}(U)$, and in particular
$\gamma^{-1}(U) \ne \emptyset$. Similarly one shows that $\gamma^{-1}(V) \ne \emptyset$.

The conclusion is thus that $\gamma^{-1}(U)$ and $\gamma^{-1}(V)$ separate $[0, 1]$, a contradiction
with the fact that $[0, 1]$ was shown to be a connected subspace of $\mathbb{R}$. We conclude
that $X$ is connected. □

Example 3.18 We now return to the example of $\mathbb{Q}$ and show that it is disconnected,
and thus also not path-connected. Indeed, let $a$ be an arbitrary irrational number and
let $U = \{x \in \mathbb{Q} \mid x < a\}$ and $V = \{x \in \mathbb{Q} \mid x > a\}$. Clearly, $U \cap V = \emptyset$ and
$U \cup V = \mathbb{Q}$, and thus all that is left to show is that $U$ and $V$ are open. Indeed, given
any $x \in U$, setting $\delta = (a - x)/2$, one easily sees that $(x - \delta, x + \delta) \cap \mathbb{Q} \subseteq U$. But
this shows that $U$ is open in the subspace topology on $\mathbb{Q}$, as required. The claim that
$V$, too, is open follows similarly.

Theorem 3.14 If $f : X \to Y$ is a surjective continuous mapping between topological
spaces and $X$ is connected, then so is $Y$.

Proof Suppose that $Y$ is disconnected, namely that $Y = U \cup V$ for some non-empty
open sets $U, V \subseteq Y$ which further satisfy $U \cap V = \emptyset$. Note that

$$X = f^{-1}(Y) = f^{-1}(U) \cup f^{-1}(V)$$

and that both $f^{-1}(U)$ and $f^{-1}(V)$ are open in $X$. Further,

$$f^{-1}(U) \cap f^{-1}(V) = f^{-1}(U \cap V) = f^{-1}(\emptyset) = \emptyset.$$

Since $X$ is connected, we may conclude that either $f^{-1}(U)$ or $f^{-1}(V)$ is empty. But,
since $f$ is surjective, that is impossible and we conclude that $Y$ is connected. □

Exercises 

Exercise 3.31 Find a topological space X which is not Hausdorff, and a quotient of 
X which is Hausdorff. Also, find a topological space X which is Hausdorff, and a 
quotient of X which is not Hausdorff. 

Exercise 3.32 Call two points $x, y$ in a topological space $X$ inseparable if for all
open sets $U$, $x \in U \iff y \in U$. Prove that inseparability is an equivalence
relation $\sim$ on $X$. Is the quotient space $X/{\sim}$ Hausdorff?

Exercise 3.33 Given topological spaces $X$ and $Y$, prove that $X \times Y$ is Hausdorff if,
and only if, each of $X$ and $Y$ is Hausdorff.

Exercise 3.34 Prove that a topological space $X$ is Hausdorff if, and only if, the
diagonal set $\{(x, x) \mid x \in X\}$ is closed in the product space $X \times X$.

Exercise 3.35 Prove that each of the Hausdorff separation property, connectivity, 
and path-connectivity is a topological invariant. Use a connectivity argument to show 
that $\mathbb{R}$ and $\mathbb{R}^2$, each with the Euclidean topology, are not homeomorphic. (Hint:
improve some of these topological invariants.) 

Exercise 3.36 Let $X$ be a topological space, and $S \subseteq X$ a subset. Prove that if $S$ (as
a subspace) is connected, then so is its closure $\overline{S}$.

Exercise 3.37 Characterize all of the connected subsets of R with the Euclidean 
topology. 

Exercise 3.38 Prove the following Intermediate Value Theorem. Let $f : X \to \mathbb{R}$
be a continuous function from a topological space $X$ to the topological space $\mathbb{R}$ with
the Euclidean topology. If $X$ is connected, then for all $x, z \in X$ and $c \in \mathbb{R}$ with

$$f(x) < c < f(z),$$

there exists a $y \in X$ with

$$f(y) = c.$$

Exercise 3.39 Prove that if $X$ and $Y$ are connected (respectively path-connected)
topological spaces, then so is $X \times Y$.

Exercise 3.40 Prove that a quotient space of a connected space is connected. 


3.5 Compactness 

The reader is probably aware of the importance of closed intervals in $\mathbb{R}$ and, more
generally, of closed and bounded subsets of $\mathbb{R}$ or $\mathbb{R}^n$. For instance, the
Bolzano-Weierstrass Theorem states that every sequence in a closed interval $[a, b]$
admits a convergent subsequence, and another classical theorem states that a continuous
function on a closed interval is uniformly continuous. It turns out that the crucial property of these
subsets allowing for the mentioned results is a topological one, namely that these
subsets are compact. The aim of this section is to present the notion of compactness
and prove several results leading to the characterization of the closed and bounded
subsets of $\mathbb{R}$ (under the Euclidean topology) as precisely the compact sets. The section
concludes with the statement of the important result, known as Tychonoff’s Theorem,
that the product of any family of compact topological spaces is itself compact.

Definition 3.24 Let $X$ be a topological space and $C \subseteq X$ a subset. An open covering
of $C$ is a collection $\{U_i\}_{i \in I}$ of open sets whose union contains the entire set $C$, i.e.

$$C \subseteq \bigcup_{i \in I} U_i.$$

The set $C$ is said to be compact if given any open covering $\{U_i\}_{i \in I}$ of $C$, there exists
a finite subcovering of $C$. That is, there exist finitely many sets $U_{i_1}, \ldots, U_{i_m}$ from
the given covering such that

$$C \subseteq U_{i_1} \cup \cdots \cup U_{i_m}.$$

Notice that $C$ may coincide with the set $X$, so we may speak of the space $X$ itself
being compact.

Example 3.19 Any finite set C in any topological space is compact. This is clear 
since any open covering of a finite set must admit a finite subcovering, using at most 
as many open sets from the covering as there are points in the set C. 

Example 3.20 Any unbounded set $Y \subseteq \mathbb{R}$, when $\mathbb{R}$ is given the Euclidean topology,
is not compact. To see that, consider the open covering

$$Y \subseteq \bigcup_{k=1}^{\infty} (-k, k).$$

Clearly, no finite subcovering suffices to cover $Y$ precisely because $Y$ was chosen to
be unbounded. In other words, any compact set in $\mathbb{R}$ must be bounded. Similarly, an
open interval $(a, b)$ is not compact in $\mathbb{R}$ either, since the open covering

$$(a, b) \subseteq \bigcup_{k=1}^{\infty} \left(a + \frac{1}{k},\, b\right)$$

admits no finite subcovering of $(a, b)$, as is easy to verify.
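
The failure of compactness for an open interval can be made concrete. The sketch below assumes the covering of $(a, b)$ by the sets $(a + 1/k, b)$, $k \ge 1$, which is one standard choice; given any finite list of indices it produces a point of $(a, b)$ that the chosen sets miss, and for any single point it produces an index whose set contains it. The function names are hypothetical and exact rationals are used to avoid rounding issues.

```python
from fractions import Fraction

def covers(a, b, indices, p):
    """True if p lies in (a + 1/k, b) for some k in `indices`."""
    return any(a + Fraction(1, k) < p < b for k in indices)

def uncovered_point(a, b, indices):
    """A point of (a, b) missed by every (a + 1/k, b) with k in `indices`."""
    K = max(indices)
    # The union of the chosen sets is contained in (a + 1/K, b), so any
    # point of (a, b) that is at most a + 1/K is missed.
    return a + min(Fraction(1, K), b - a) / 2

def covering_index(a, p):
    """An index k with a + 1/k < p, valid for any p > a."""
    return int(1 / (p - a)) + 1

a, b = Fraction(0), Fraction(1)
finite_choice = [1, 2, 5, 40]
p = uncovered_point(a, b, finite_choice)
print(p, covers(a, b, finite_choice, p))

k = covering_index(a, p)
print(k, covers(a, b, [k], p))
```

Every point is caught by some set of the covering, yet every finite selection leaves a point out, which is exactly what the absence of a finite subcovering means.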

Example 3.21 For any set $X$, if $X$ is endowed with the indiscrete topology, then
there are only two open sets (or sometimes fewer) in existence and so every subset
$C \subseteq X$ is certainly compact. If, however, $X$ is given the discrete topology, then every
singleton set $\{x\}$ is open, and thus any subset $C \subseteq X$ admits the open covering

$$C = \bigcup_{x \in C} \{x\}$$

and thus is compact if, and only if, $C$ is finite.

The next example is important enough to be stated as a theorem. 

Theorem 3.15 A closed interval $[a, b]$ in $\mathbb{R}$ with the Euclidean topology is compact.

Proof Suppose that an open covering $\{U_i\}_{i \in I}$ of $[a, b]$ is given, and the goal is to
prove a finite subcovering exists. Define the set $S$ of all $a \le s \le b$ for which a finite
subcovering of $[a, s]$ exists, and thus the goal now is to show that $b \in S$. Now, since
$[a, a] = \{a\}$, which can certainly be covered by just one element from the given
covering, it follows that $a \in S$, and thus $S$ is not empty. By definition, $S$ is bounded,
and thus has a supremum $t$. We will show that $t = b$. Firstly, take some $U_{i_0}$ from the
given covering for which $t \in U_{i_0}$. As $U_{i_0}$ is open, we have

$$t \in (t - \varepsilon, t + \varepsilon) \subseteq U_{i_0}$$

for some $\varepsilon > 0$. Since $t - \varepsilon < t$ and $t$ is the supremum of $S$, there is some $s \in S$
with $s > t - \varepsilon$. There is then a finite subcovering of $[a, s]$, and by adjoining $U_{i_0}$ to it
we obtain a finite subcovering of $[a, t]$, and thus $t \in S$. But then, if
$t < b$, this finite subcovering of $[a, t]$ is (by considering the set $U_{i_0}$ containing $t$)
also a finite subcovering of $[a, t + \delta]$ for a suitable $\delta > 0$, contrary to $t$ being an
upper bound of $S$. We thus conclude that $t = b$ and thus that one can extract a finite subcover of
all of $[a, b]$. □

Generally speaking, a compact subset need not be closed, as the following example 
shows. 

Example 3.22 Let $X$ be an arbitrary set and consider the cofinite topology on it. The
open sets (other than $\emptyset$) are those whose complements are finite, and thus we may
identify the closed sets (other than $X$ itself) as precisely the finite subsets. We now
show that any subset $C \subseteq X$ is compact. Suppose an open covering $\{U_i\}_{i \in I}$ of $C$
is given. To extract a finite subcovering, consider any non-empty member $U_i$ of the
covering, and note that since it is open it misses at most finitely many of the elements
of $C$. Thus only finitely many other open sets from the given covering are required
to cover $C$, and thus a finite subcovering exists. In particular then, if $X$ is infinite,
then not every compact subset of it is closed.

Lemma 3.2 A closed subset C of a compact space X is compact. 

Proof Given an open covering of $C$, simply note that adding to it the set $X - C$,
which is open since $C$ is assumed closed, yields an open covering of $X$, which is
compact. A finite subcovering of $X$ thus must exist and discarding $X - C$ from it if
necessary yields a finite subcovering of $C$, as required. □

The compact subsets of $\mathbb{R}$ with the Euclidean topology can be characterized as
precisely the closed and bounded sets, as we proceed to show, starting with the
following independently important result.

Proposition 3.12 A compact subset $C$ of a Hausdorff space $X$ is closed.

Proof If $C = X$, then $C$ is closed, so we may assume $X - C$ is not empty, and we
proceed to show that it is open. Let $x_0 \in X - C$ be an arbitrary point. For every
$y \in C$, using the Hausdorff assumption on $X$, there exist disjoint open sets $U_y$ and
$V_y$ such that $x_0 \in U_y$ and $y \in V_y$. Clearly, the collection $\{V_y\}_{y \in C}$ is an open covering
of $C$, and since $C$ is compact there exists a finite subcovering $V_{y_1}, \ldots, V_{y_k}$ such that

$$C \subseteq V_{y_1} \cup \cdots \cup V_{y_k}.$$

Let

$$U = U_{y_1} \cap \cdots \cap U_{y_k}$$

and note that $U$ is open, that $x_0 \in U$, and that $U \subseteq X - C$, since $U$ is disjoint from
each $V_{y_j}$. In other words, $x_0$ is an
interior point of $X - C$ and since $x_0$ was an arbitrary point, every point of $X - C$ is
interior, and thus, by Corollary 3.1, $X - C$ is open. □


Theorem 3.16 (Heine-Borel) With respect to the Euclidean topology on $\mathbb{R}$, a subset
$C \subseteq \mathbb{R}$ is compact if and only if $C$ is closed and bounded.


Proof Assume that $C \subseteq \mathbb{R}$ is a compact subset. Since $\mathbb{R}$ is Hausdorff, it follows by
Proposition 3.12 that $C$ is closed, while the fact that $C$ must be bounded was already
noted in Example 3.20. In the other direction, if $C$ is bounded, then it is contained
in some closed interval $[a, b]$, which by Theorem 3.15 is a compact set. If $C$ is also
closed then it is a closed subset of the compact set $[a, b]$ and thus, by Lemma 3.2, is
itself compact. □


The next result is a generalization of the Bolzano-Weierstrass Theorem.

Theorem 3.17 An infinite subset $S$ of a compact topological space $X$ has a cluster
point in $X$.

Proof Suppose that $S$ does not have any cluster points in $X$. That means that for
every $x \in X$ there exists an open set $U_x$ with $x \in U_x$ and such that $U_x \cap S$ is either
empty or the set $\{x\}$. Clearly, $\{U_x\}_{x \in X}$ is an open cover of $X$, and thus admits a finite
subcover, say $U_{x_1}, \ldots, U_{x_n}$, and thus

$$S \subseteq U_{x_1} \cup \cdots \cup U_{x_n}.$$

But $U_{x_i} \cap S$ contains at most one element, and we thus conclude that $|S| \le n$, so $S$ is
finite. Consequently, if $S$ is infinite, then it must have at least one cluster point. □

We end this section by mentioning the following important result. 

Theorem 3.18 (Tychonoff Theorem) The product of any number (finite or infinite) 
of compact topological spaces is itself compact. 

The proof of this result is slightly beyond the scope of this book. 

Exercises 

Exercise 3.41 Show that a subset of a compact space may be compact but may 
also be non-compact. Similarly, show that a subset of a non-compact space may be 
compact but may also be non-compact. 

Exercise 3.42 Prove, without appealing to Tychonoff’s Theorem, that the product
of any finite number of compact topological spaces is compact. 

Exercise 3.43 Prove that if C is a non-empty compact subset of R with respect to 
the Euclidean topology, then C admits both a minimum and a maximum. 

Exercise 3.44

1. Prove that if $f : X \to Y$ is a continuous function between topological spaces
and $C \subseteq X$ is compact, then $f(C)$ is compact in $Y$.

2. Prove the following Extreme Value Theorem. Let $f : X \to \mathbb{R}$ be a continuous
function from a topological space $X$ to the topological space $\mathbb{R}$ with the Euclidean
topology. If $X$ is compact and non-empty, then $f$ attains both a maximum and a
minimum, i.e., there exist points $x_m, x_M \in X$ such that

$$f(x_m) \le f(x) \le f(x_M)$$

for all $x \in X$.

3. Deduce Weierstrass’ Theorem: a continuous function $f : [a, b] \to \mathbb{R}$ attains both
a minimum and a maximum.

Exercise 3.45 Prove that compactness is a topological invariant, namely if $X$ and $Y$
are homeomorphic topological spaces and one is compact, then so is the other.

Exercise 3.46 Prove that the coproduct of finitely many compact spaces is itself
compact. Does the result remain valid for the coproduct of infinitely many compact
spaces?

Exercise 3.47 Prove that a quotient space of a compact space is itself compact. 

Exercise 3.48 Construct a topological space X with a non-empty proper subset 
which is both open and compact. 

Exercise 3.49 Show that if X is an uncountable set endowed with the cocountable 
topology, then X is not compact. What are the compact subsets of it? 

Exercise 3.50 A topological space $X$ satisfies the finite intersection property if for
any collection $\mathcal{F}$ of closed sets, if

$$F_1 \cap \cdots \cap F_m \ne \emptyset$$

for all $F_1, \ldots, F_m \in \mathcal{F}$, then

$$\bigcap_{F \in \mathcal{F}} F \ne \emptyset.$$

Prove that a topological space $X$ has the finite intersection property if, and only if,
$X$ is compact.

Further Reading 

For a perspective on some of the ideas that led to the birth of topology see [4]. For a
comprehensive introduction to topology, including most of the results in this chapter
and far more, the reader is referred to [3, 5]. For a remarkably unexpected usage
of topology, illustrating its versatility, see Furstenberg’s proof of the infinitude of
primes [1]. For a collection of essays on various topological developments, including
an essay devoted to topology and physics, see [2].

Chapter 4

Metric Spaces 


Abstract This chapter is concerned with the basic concepts of metric spaces and 
results that are most relevant to applications in Hilbert space theory. Assuming no 
prior knowledge of metric spaces, the definitions are given in detail and the relation 
to normed spaces made explicit early on. The main theorems proved are the Banach 
Fixed-Point Theorem, Baire’s Theorem, the equivalence between compact sets and 
complete totally bounded sets, and an account of completion. 

Keywords Metric space • Uniformly continuous function • Isometry • Banach fixed- 
point Theorem • Baire’s Theorem • Cauchy condition • Metric completion • Totally 
bounded space • Metric topology • Compact metric space 


The concept of a metric space was introduced by Maurice René Fréchet in 1906 in
his Ph.D. dissertation devoted to functional analysis. The axioms given by Fréchet
(which are almost identical to the ones we give below) form an abstraction of the 
notion of distance thus allowing for a unified treatment of numerous particular cases 
under a single formalism. The ubiquity of metric spaces in Mathematics and Physics 
is quite overwhelming but, naturally, this chapter is mainly concerned with examples 
and results most relevant in the context of this book. 

Section 4.1 contains the definition of metric space and explores some examples,
primarily of normed spaces as metric spaces. Section 4.2 is concerned with the
topology induced by the distance function, the associated concept of convergence, and
some basic topological observations regarding the induced topology. Section 4.3
considers two types of structure preservation for mappings between metric spaces,
namely non-expanding mappings and uniformly continuous ones. Section 4.4 is
concerned with the concept of completeness in metric spaces, the Banach Fixed-Point
Theorem and Baire’s Theorem, and Cantor’s metric completion process by means
of Cauchy sequences. Finally, Sect. 4.5 examines compactness for metric spaces and
establishes the fact that a metric space is compact precisely when it is complete and
totally bounded.

4.1 Metric Spaces: Definition and Examples

Our intuitive understanding of the behaviour of distance, in the plane for instance, is 
clearly represented in the definition of metric space that we give below. Of particular 
importance is the simple fact that any normed space has a natural metric structure 
associated with it (though most metric spaces certainly do not arise in this manner). 
Consequently, the normed spaces encountered in Chap. 2 are all revisited here so that 
we may examine some of their metric properties. 

Before we proceed with the definition and with a list of relevant examples we note
that for the general study of metric spaces there are technical advantages in allowing
infinite distance between points. We thus first adjoin the symbol $\infty$ as an extended
real number and denote by $\mathbb{R}_+ = \{x \in \mathbb{R} \mid x \ge 0\} \cup \{\infty\}$ the set of extended
non-negative real numbers.

Remark 4.1 We accept that $x \le \infty$ for all real $x$ (and we also accept that $\infty \le \infty$).
Addition is extended to include $\infty$ by setting $x + \infty = \infty + x = \infty$ for all extended
real $x \ge 0$.

Definition 4.1 Let $X$ be a set. A function $d : X \times X \to \mathbb{R}_+$ is said to be a metric or
a distance function on $X$ if, for all $x, y, z \in X$, the following properties are satisfied.

1. Distance is symmetric, i.e., $d(x, y) = d(y, x)$.

2. The triangle inequality is satisfied, i.e., $d(x, z) \le d(x, y) + d(y, z)$.

3. Distance separates points, i.e., $d(x, y) = 0$ if, and only if, $x = y$.

The pair $(X, d)$ is then called a metric space and $d(x, y)$ the distance between $x$
and $y$. Quite often we refer to the metric space $X$ without explicitly mentioning the
distance function $d$. Moreover, when two (or more) metric spaces are considered, we
often overload the symbol $d$ to stand for the metric function in each space, relying on
semantics to resolve any confusion. If needed, we may refer to the metric function
of $X$ by $d_X$.
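
The three axioms are simple enough to test mechanically on a finite sample of points, including the case of infinite distance. The following sketch is only a sanity check on samples, not a proof; the checker `is_metric` and both candidate functions are hypothetical examples of our own, with `math.inf` standing in for $\infty$.

```python
import math
from itertools import product

def is_metric(points, d):
    """Check symmetry, the triangle inequality and separation on `points`."""
    for x, y, z in product(points, repeat=3):
        if d(x, y) != d(y, x):
            return False
        if d(x, z) > d(x, y) + d(y, z):
            return False
        if (d(x, y) == 0) != (x == y):
            return False
    return True

def d_split(x, y):
    """Usual distance within each sign class, infinite distance across."""
    if (x < 0) != (y < 0):
        return math.inf
    return abs(x - y)

def d_square(x, y):
    """Squared difference: symmetric and separating, but not a metric."""
    return (x - y) ** 2

points = [-3, -1, 0, 2, 5]
print(is_metric(points, d_split))   # all three axioms hold on the sample
print(is_metric(points, d_square))  # the triangle inequality fails
```

The second candidate fails because, for instance, the squared difference between $0$ and $2$ is $4$, which exceeds the sum $1 + 1$ obtained by passing through $1$; on the sample above the same failure occurs for the points $-3, -1, 0$.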

Remark 4.2 There is quite a lot of flexibility in the choice of axioms above, giving
rise to several structures that may resemble metric spaces to varying degrees. For
instance, as noted above, it is possible that $d(x, y) = \infty$, that is we allow the distance
between points to be infinite. Some authors do not permit the distance function to
attain the value $\infty$. In some sense, the difference between these two possibilities is
largely cosmetic. It is also possible to omit the separation condition, giving rise to
what are known as semimetric spaces, and again, their theory is quite similar to that
of metric spaces. If instead one neglects symmetry, giving rise to what are known as
quasimetric spaces, the resulting theory is in many respects strikingly different from
that of metric spaces.

Observing that all of the axioms of metric space are universally quantified, it follows
immediately that if $S \subseteq X$ is a subset of a metric space $X$, then the restriction of the
distance function $d : X \times X \to \mathbb{R}_+$ to the subset $S \times S$ is a distance function on $S$.

Definition 4.2 (Metric Subspace) For a subset $S$ of a metric space $X$ the pair $(S, d_S)$,
where $d_S$ is the restriction of $d$ to $S \times S$, is called a metric subspace (or simply
subspace if the metric context is clear) of $X$.

There are various ways for a function $f$ between metric spaces to interact with the 
metric structures, namely to preserve the distances to various degrees. The strictest 
preservation of distance gives rise to the following definition. 

Definition 4.3 A function $f : X \to Y$ between metric spaces is said to be an 
isometry if 

$$d(f(x_1), f(x_2)) = d(x_1, x_2)$$

for all $x_1, x_2 \in X$. If $f$ is also bijective, then $f$ is said to be an isometric isomorphism 
or a global isometry. If there exists an isometric isomorphism between $X$ and $Y$, then 
we write $X \cong Y$ and we say that $X$ and $Y$ are isometric. 

A property of metric spaces is said to be a metric invariant if whenever it holds for 
a metric space X it also holds for any other metric space isometric to X. 

The following theorem is a rich source of metric spaces. 

Theorem 4.1 If $V$ is a normed space, then defining $d : V \times V \to \mathbb{R}_+$ by 

$$d(x, y) = \|x - y\|$$

endows $V$ with the structure of a metric space. 

Proof Note first that the codomain of $d$ is indeed $\mathbb{R}_+$ since the norm is always non-negative 
(in this case $\infty$ is never attained as the distance between points). For all 
vectors $x, y, z \in V$, symmetry follows by 

$$d(x, y) = \|x - y\| = \|(-1) \cdot (y - x)\| = |-1| \cdot \|y - x\| = \|y - x\| = d(y, x),$$

while separation and the triangle inequality are just restatements of the conditions 
of positivity and the triangle inequality in the definition of normed space (refer to 
Definition 2.15, and with the aid of Proposition 2.12). □ 

Example 4.1 Each of the normed spaces (and in particular the inner product spaces) 
introduced in Chap. 2 gives rise to a metric space, and we obtain the following 
examples of metric spaces. 

The space $\mathbb{R}^n$, $n \ge 1$, with distance given by 

$$d(x, y) = \left( \sum_{k=1}^{n} (x_k - y_k)^2 \right)^{1/2},$$

known as the Euclidean metric on $\mathbb{R}^n$. The space $\mathbb{C}^n$, $n \ge 1$, with distance given by 

$$d(x, y) = \left( \sum_{k=1}^{n} |x_k - y_k|^2 \right)^{1/2},$$

where we may immediately note that $\mathbb{C}^n$ is isometric to $\mathbb{R}^{2n}$ with the Euclidean 
metric; a global isometry $f : \mathbb{C}^n \to \mathbb{R}^{2n}$ is given by 

$$f(x_1 + y_1 i, \dots, x_n + y_n i) = (x_1, y_1, x_2, y_2, \dots, x_n, y_n).$$

For all $1 \le p < \infty$, every pre-$L_p$ space, say of continuous real-valued functions on 
the closed interval $[a, b]$, is a metric space of functions with distance given by 

$$d(x, y) = \left( \int_a^b dt\, |x(t) - y(t)|^p \right)^{1/p}.$$

In particular, $C([a, b], \mathbb{R})$ supports infinitely many different metric space structures, 
one for each $1 \le p < \infty$. The case $p = \infty$, i.e., with the $L_\infty$ norm, gives rise to yet 
another distance function, namely 

$$d(x, y) = \max_{a \le t \le b} |x(t) - y(t)|.$$

Similarly, the space $C(I, \mathbb{C})$ of continuous complex-valued functions on a closed 
interval $I = [a, b]$ supports an entire family of metric space structures. For each 
$1 \le p < \infty$ the space $\ell_p$ of absolutely $p$-power summable sequences of either real 
or complex numbers with distance given by 

$$d(x, y) = \left( \sum_{k=1}^{\infty} |x_k - y_k|^p \right)^{1/p}$$

is yet another family of metric spaces. Of course the case $p = \infty$, i.e., the space $\ell_\infty$ 
of bounded sequences with distance given by 

$$d(x, y) = \sup_{k \ge 1} |x_k - y_k|$$

is a metric space as well. Beware here not to confuse the value $\infty$ for the parameter 
$p$ with a limiting process. 

$\mathbb{R}^n$ and $\mathbb{C}^n$ support other metric structures, different from those presented above, as 
the following example shows. 

Example 4.2 If we consider $\mathbb{R}^n$ as a subset of $\ell_p$, with $1 \le p \le \infty$, by identifying 
$x = (x_1, \dots, x_n)$ with $(x_1, \dots, x_n, 0, 0, 0, \dots)$, then $\mathbb{R}^n$ inherits the metric of $\ell_p$ 
and becomes a metric subspace of $\ell_p$. In more detail, if $p = \infty$ then 

$$d(x, y) = \max_{1 \le k \le n} |x_k - y_k|$$

and otherwise 

$$d(x, y) = \left( \sum_{k=1}^{n} |x_k - y_k|^p \right)^{1/p}.$$

If $n > 1$ and $p \ne 2$, then each of these metrics is different from the Euclidean metric. 
A similar observation gives rise to a family of metric structures on $\mathbb{C}^n$. 
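To see concretely how the metrics of Example 4.2 differ, the following sketch evaluates them on one pair of points in the plane. The function name `d_p` and the sample points are assumptions chosen for illustration.

```python
import math

def d_p(x, y, p):
    """Distance that R^n inherits from l_p; p may be math.inf."""
    diffs = [abs(a - b) for a, b in zip(x, y)]
    if p == math.inf:
        return max(diffs)
    return sum(t ** p for t in diffs) ** (1.0 / p)

x, y = (0.0, 0.0), (3.0, 4.0)
for p in (1, 2, 3, math.inf):
    print(p, d_p(x, y, p))
```

For this pair the distance decreases as $p$ grows, from 7 at $p = 1$ through 5 at $p = 2$ down to 4 at $p = \infty$.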

The following examples illustrate the subtleties that may arise with general metric 
spaces. 

Example 4.3 Consider $\mathbb{R}^2$ with the Euclidean metric, i.e., the plane with its ordinary 
Euclidean metric structure. The unit circle $S = \{x \in \mathbb{R}^2 \mid \|x\| = 1\}$ has two natural 
metric structures on it. As a subspace of $\mathbb{R}^2$, the distance between two antipodal 
points in $S$ is 2 (the length of the shortest straight line connecting the two points 
where the line is allowed to pass freely in the ambient space $\mathbb{R}^2$). On the other 
hand, it is possible to endow the unit circle $S$ with the so-called intrinsic metric, the 
one where the distance between two points is the length of the shorter of the two 
arcs connecting them. With that metric the distance between antipodal points is $\pi$ 
(the length of a shortest geodesic connecting the two points, that is a “straight” line 
connecting the points where the line is not allowed to step outside of $S$ and into the 
ambient space). 
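The two metrics on the circle can be compared numerically. In this sketch points of $S$ are described by their angles, which is an assumption made for convenience; the names `chordal` and `intrinsic` are likewise illustrative.

```python
import math

def chordal(theta1, theta2):
    """Subspace metric: Euclidean distance in R^2 between the two points."""
    x1, y1 = math.cos(theta1), math.sin(theta1)
    x2, y2 = math.cos(theta2), math.sin(theta2)
    return math.hypot(x1 - x2, y1 - y2)

def intrinsic(theta1, theta2):
    """Intrinsic metric: length of the shorter arc between the two points."""
    gap = abs(theta1 - theta2) % (2 * math.pi)
    return min(gap, 2 * math.pi - gap)

# Antipodal points: angles 0 and pi.
print(chordal(0.0, math.pi), intrinsic(0.0, math.pi))
```

For antipodal points the first value is 2 and the second is $\pi$, up to floating-point rounding.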

Other scenarios, perhaps more abstract, are also possible. For instance, given any 
set $X$, defining $d(x, x) = 0$ for all $x \in X$ and $d(x, y) = 1$ for all $x, y \in X$ with 
$x \ne y$ turns $X$ into a metric space. In this metric space the distance between any 
two distinct points is 1. Thus, if $X$ is infinite, then $X$ is not isometric to any subspace 
of the Euclidean space $\mathbb{R}^n$, for any $n \ge 1$. Notice however that defining $d(x, y) = 0$ 
for all $x, y \in X$ fails to give a metric space (unless $X$ has at most one element) since 
separation fails. It is however easily seen to be an example of a semimetric space. 

Example 4.4 The set $\mathbb{R}_+$ can be given the metric structure 

$$d(x, y) = \begin{cases} |x - y| & \text{if } x \ne \infty \ne y \\ 0 & \text{if } x = y = \infty \\ \infty & \text{otherwise.} \end{cases}$$

The metric axioms are easily verified. In fact, we will simply write $d(x, y) = |x - y|$ 
since the algebraic properties of this metric significantly resemble those of $|x - y|$ 
for real $x, y$. 
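The case distinction in Example 4.4 matters in practice. A minimal sketch, assuming floating-point `math.inf` stands in for $\infty$ and using the hypothetical name `d_ext`:

```python
import math

def d_ext(x, y):
    """The metric of Example 4.4 on the extended non-negative reals."""
    if x != math.inf and y != math.inf:
        return abs(x - y)
    if x == math.inf and y == math.inf:
        return 0.0
    return math.inf

samples = [0.0, 1.0, 2.5, math.inf]
triangle_ok = all(
    d_ext(x, z) <= d_ext(x, y) + d_ext(y, z)
    for x in samples for y in samples for z in samples
)
print(d_ext(2.0, 5.0), d_ext(math.inf, math.inf), d_ext(1.0, math.inf), triangle_ok)
```

The explicit second case is needed because evaluating `abs(math.inf - math.inf)` directly yields `nan` rather than 0.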

Finally, we present one of many ways to endow the cartesian product $X \times Y$ of two 
metric spaces with a metric structure. 

Definition 4.4 (The Additive Metric Product) For metric spaces $X$ and $Y$ the function 
$d : (X \times Y) \times (X \times Y) \to \mathbb{R}_+$ given by 

$$d((x, y), (x', y')) = d_X(x, x') + d_Y(y, y')$$

is easily seen to be a distance function (as the reader may verify) and the metric space 
$(X \times Y, d)$ is called the additive metric product of $X$ and $Y$. 

Exercises 

Exercise 4.1 Let $f : [0, \infty) \to [0, \infty)$ be a concave, strictly monotonically increasing 
function satisfying $f(0) = 0$. Suppose that $(X, d)$ is a metric space and that $d$ 
only attains finite values. Prove that $(X, f \circ d)$ is a metric space. 

Exercise 4.2 Let $(X, d_X)$ and $(Y, d_Y)$ be two disjoint (i.e., $X \cap Y = \emptyset$) metric spaces 
and let $Z = X \cup Y$. Prove that $d : Z \times Z \to \mathbb{R}_+$ given by 

$$d(z_1, z_2) = \begin{cases} d_X(z_1, z_2) & \text{if } z_1, z_2 \in X \\ d_Y(z_1, z_2) & \text{if } z_1, z_2 \in Y \\ \infty & \text{otherwise} \end{cases}$$

is a metric on $Z$. The metric space $(Z, d)$ is then called the coproduct of $X$ and $Y$. 
Generalize this construction to give the coproduct of any number (finite or infinite) 
of metric spaces. 

Exercise 4.3 Let $X$ be a metric space. Define the relation $\sim$ on $X$ by $x \sim y$ precisely 
when $d(x, y) < \infty$. Prove that $\sim$ is an equivalence relation. The equivalence class 
$[x]$, for any point $x \in X$, is called the galaxy of $x$. Show that each galaxy, as a metric 
subspace of $X$, has all distances finite and then show that $X$ is the coproduct (see 
previous exercise) of its galaxies. 

Exercise 4.4 Suppose that $(X, d)$ is a semimetric space and define the relation $\sim$ 
on $X$ by $x \sim y$ precisely when $d(x, y) = 0$. Prove that $\sim$ is an equivalence relation 
on $X$ and that $d([x], [y]) = d(x, y)$ on the quotient set $X/{\sim}$ is well-defined and 
endows it with the structure of a metric space. 

Exercise 4.5 Given any two metric spaces $(X, d_X)$ and $(Y, d_Y)$, define the function 
$d_E : (X \times Y) \times (X \times Y) \to \mathbb{R}_+$ by 

$$d_E((x_1, y_1), (x_2, y_2)) = \sqrt{d_X(x_1, x_2)^2 + d_Y(y_1, y_2)^2}.$$

Prove that $(X \times Y, d_E)$ is a metric space, called the Euclidean metric product of $X$ 
and $Y$. Generalize this construction to obtain the Euclidean metric product of any 
finite number of metric spaces. 

Exercise 4.6 Prove that every isometry is injective. 

Exercise 4.7 For all metric spaces $X$, $Y$, and $Z$ prove that 

1. $X \cong X$ 

2. If $X \cong Y$, then $Y \cong X$ 

3. If $X \cong Y$ and $Y \cong Z$, then $X \cong Z$. 

Exercise 4.8 Prove that the induced metric in a normed space $V$ satisfies the equality 
$d(x + z, y + z) = d(x, y)$ for all vectors $x, y, z \in V$. Namely, the induced metric in 
a normed space $V$ is translation invariant. 

Exercise 4.9 Consider the set $\mathcal{R}$ of all Riemann integrable functions $f : [a, b] \to \mathbb{R}$ 
and define $d : \mathcal{R} \times \mathcal{R} \to \mathbb{R}_+$ by $d(f, g) = \int_a^b dt\, |f(t) - g(t)|$. Is $(\mathcal{R}, d)$ a metric 
space? 

Exercise 4.10 A mid-point for two distinct points $x, z$ in a metric space $X$ is a point 
$y \in X$ such that $d(x, y) = d(y, z) = d(x, z)/2$. Give examples of: 

• A metric space where every two distinct points admit a unique mid-point. 

• A metric space where every two distinct points admit a mid-point, and some admit 
two distinct mid-points. 

• A metric space where no two distinct points admit a mid-point. 


4.2 Topology and Convergence in a Metric Space 

This section is concerned with the topological structure of a metric space. Every 
metric space has a topology associated with it which is determined by the distance 
function. A metric space is thus automatically a topological space and thus all of the 
concepts of topological spaces, including convergence, are meaningful in a metric 
space. After describing this so called induced topology we examine the concept of 
convergence in the metric spaces introduced above. It should be noted immediately 
that different metrics on the same set may induce the same topology. A topological 
space induced by a metric is called a metrizable space and we study some of the 
properties of such spaces. 

As motivation for the induced topology we follow a similar route to the one we 
took in Sect. 3.1 where the concept of topology was motivated as a means to capture 
continuity while avoiding arbitrary properties of the distance function. 

Before we proceed we note that the familiar notion of continuity of a function 
applies (formally verbatim) to arbitrary metric spaces. 

Definition 4.5 A function $f : X \to Y$ between metric spaces is continuous if for 
all $x \in X$ and $\varepsilon > 0$ there exists a $\delta > 0$ such that 

$$d(x, x') < \delta \implies d(f(x), f(x')) < \varepsilon$$

for all $x' \in X$. 


4.2.1 The Induced Topology 

The topology induced by a metric is given in terms of what are known as open balls, 
which are shown below to form a basis for a topology. It should be made clear at 
once that the resulting topology does not carry all of the metric information. In fact, 
it carries very little of it. There may be many radically different distance functions 
on a single set, all of which induce the same topology. Moreover, not every topology 
is necessarily induced by a metric. Thus, replacing a metric by the induced topology 
loses a lot of information and misses many topological spaces. However, it is often 
convenient to be able to forget information, especially when it is not really needed 
for the problem at hand. 

Definition 4.6 Let $X$ be a metric space, $x \in X$, and $\varepsilon > 0$. The open ball with 
centre $x$ and radius $\varepsilon$ is the set 

$$B_\varepsilon(x) = \{y \in X \mid d(x, y) < \varepsilon\}.$$

Example 4.5 Consider the Euclidean metric on $\mathbb{R}^n$. The open ball $B_\varepsilon(x)$ in $\mathbb{R}$ is the 
open interval $(x - \varepsilon, x + \varepsilon)$, while in $\mathbb{R}^2$ it is the interior of the circle with centre $x$ 
and radius $\varepsilon$, and in $\mathbb{R}^3$ it is the interior of a ball with centre $x$ and radius $\varepsilon$. 

Obviously, in a general metric space an open ball need not look anything like a 
ball. Figure 4.1 shows the loci of points in $\mathbb{R}^2$ satisfying $\|x\| = 1$ under the $\ell_p$ norm 
for different values of $p$. Note that the case $p = 1/2$ does not give a metric at all and is just 
added here for the sake of completeness. 
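The failure at $p = 1/2$ can be checked on a single pair of vectors. The sketch below, with the hypothetical helper `norm_p`, compares $\|u + v\|_p$ with $\|u\|_p + \|v\|_p$ for the two standard basis vectors of the plane.

```python
def norm_p(x, p):
    """The expression (sum |x_k|^p)^(1/p); a norm only when p >= 1."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

u, v = (1.0, 0.0), (0.0, 1.0)
w = (1.0, 1.0)  # the sum u + v
for p in (0.5, 1, 2):
    print(p, norm_p(w, p), norm_p(u, p) + norm_p(v, p))
```

For $p = 1/2$ the left-hand value is $(1 + 1)^2 = 4$ while the right-hand value is 2, so the triangle inequality fails for the would-be distance $d(x, y) = \|x - y\|_{1/2}$ (take the points $u$, $0$, and $-v$).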

We now show that the open balls in a metric space X form a basis for a topology. 

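Theorem 4.2 The collection $\{B_\varepsilon(x) \mid x \in X, \varepsilon > 0\}$ of all open balls in a metric 
space $X$ satisfies the conditions of Theorem 3.10. 

Proof Since $x \in B_\varepsilon(x)$ for all $x \in X$ and $\varepsilon > 0$, it follows at once that the open balls 
cover all of $X$. Next, suppose that $z \in B_{\varepsilon_1}(x) \cap B_{\varepsilon_2}(y)$, and we will demonstrate that 
$z \in B_\varepsilon(z) \subseteq B_{\varepsilon_1}(x) \cap B_{\varepsilon_2}(y)$ for $\varepsilon = \min\{\varepsilon_1 - d(x, z), \varepsilon_2 - d(y, z)\}$. Indeed, if 
$d(w, z) < \varepsilon$, then $d(w, x) \le d(w, z) + d(z, x) < (\varepsilon_1 - d(x, z)) + d(z, x) = \varepsilon_1$, and 
so $w \in B_{\varepsilon_1}(x)$, and therefore $B_\varepsilon(z) \subseteq B_{\varepsilon_1}(x)$. Similarly, $B_\varepsilon(z) \subseteq B_{\varepsilon_2}(y)$. Noting 
that $\varepsilon > 0$, the proof is complete. □ 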

The following definition is now justified. 

Definition 4.7 Given a metric space $(X, d)$, the induced topology on the set $X$ is 
the topology generated by the open balls in $X$. In particular, all open balls in $X$ are 
open sets, and a subset $U \subseteq X$ is open in the induced topology precisely when for 
all $x \in U$ there exists an $\varepsilon > 0$ such that $B_\varepsilon(x) \subseteq U$. Any topological space $(X, \tau)$ 
for which there exists a metric $d$ on $X$ inducing the given topology $\tau$ is called a 
metrizable space. 

Remark 4.3 Unless otherwise stated, every metric space is silently endowed with the 
induced topology. Consequently, every metric space $X$ is also a topological space. 

Fig. 4.1 Points in $\mathbb{R}^2$ satisfying $\|x\|_p = 1$ with $\|x\|_p = \left( \sum_k |x_k|^p \right)^{1/p}$, for different values of $p$ 


Example 4.6 It is straightforward to see that the Euclidean distance $d(x, y) = |x - y|$ 
on $\mathbb{R}$ induces the Euclidean topology on $\mathbb{R}$. One may also readily confirm that the 
discrete topology on a set $X$ is induced by the distance function $d : X \times X \to \mathbb{R}_+$ 
given by $d(x, y) = 1$ if $x \ne y$ and $d(x, x) = 0$, thus any discrete topological space 
is metrizable. Examples of non-metrizable spaces are easily constructed once a few 
basic facts about metrizable spaces are observed, as done below. 

Theorem 4.3 A function $f : X \to Y$ between metric spaces is continuous in the 
sense of Definition 4.5 if and only if it is continuous with respect to the induced 
topologies. 

The proof is formally identical to the proof of Theorem 3.2 and is left for the reader. 
We now turn to examine some general properties of metrizable spaces. 

Theorem 4.4 Every metrizable space is Hausdorff. 

Proof Let $X$ be a topological space whose topology is induced by a metric $d$. Given 
distinct points $x, y \in X$ let $\varepsilon = d(x, y)/2$, and notice that $\varepsilon > 0$. We claim that the 
open balls $B_\varepsilon(x)$ and $B_\varepsilon(y)$ separate $x$ and $y$. Indeed, if $z \in B_\varepsilon(x) \cap B_\varepsilon(y)$, then we 
would have $d(x, y) \le d(x, z) + d(z, y) < 2\varepsilon = d(x, y)$, which is absurd. □ 

By Theorem 3.12, convergent sequences in Hausdorff spaces have unique limits, 
leading to the following familiar result stated for general metric spaces. 

Corollary 4.1 In a metric space, a convergent sequence has a unique limit. 

Apart from being Hausdorff, every metrizable topology, as we will see, has other 
pleasant features influencing the behaviour of, e.g., sequences. It is worthwhile to 
notice that it is the interaction of a metric space with R + (via the distance function), 
which causes some of the familiar properties of the real numbers to reflect back to 
any metric space. This is shown in the next result. 

Theorem 4.5 Every metrizable space is first countable. 

Proof Let $X$ be a topological space whose topology is induced by a metric $d$, and 
let $x \in X$. We need to exhibit a countable local basis at $x$. Indeed, consider the 
collection $\{B_{1/n}(x) \mid n \in \mathbb{N}\}$, which is clearly countable. To show that it is a local 
basis, suppose that $x \in U$ for an open set $U \subseteq X$. Then there is an $\varepsilon > 0$ with 
$B_\varepsilon(x) \subseteq U$, and thus if we take any $n > 1/\varepsilon$, then $B_{1/n}(x) \subseteq B_\varepsilon(x) \subseteq U$, as 
required. □ 

Theorem 3.9 yields the following corollaries. 

Corollary 4.2 A point $x$ in a metric space $X$ is a cluster point of a subset $S \subseteq X$ if, 
and only if, $x$ is the limit of a sequence of points from $S$. 

Corollary 4.3 A subset $F$ of a metrizable space $X$ is closed if, and only if, $F$ contains 
all limit points of sequences in $F$. 

Corollary 4.4 (Heine’s Continuity Criterion) A function $f : X \to Y$ between two 
metrizable spaces is continuous if, and only if, 

$$x_m \to x_0 \implies f(x_m) \to f(x_0)$$

for all sequences $\{x_m\}_{m \ge 1}$ in $X$. 

By Theorem 3.8, a second countable space is separable. We now show that for metric 
spaces the converse holds as well. 

Theorem 4.6 A separable metrizable space X is second countable. 

Proof Let $d$ be a metric inducing the topology on $X$. We are given the existence of a 
countable and dense set $D = \{x_k\}_{k \in \mathbb{N}}$ in $X$ and we need to exhibit a countable basis 
for $X$. In the proof of Theorem 4.5 we saw that the collection 

$$\mathcal{B}_k = \{B_{1/n}(x_k) \mid n \in \mathbb{N}\}$$

is a local basis at $x_k$. We now show that 

$$\mathcal{B} = \bigcup_{k \in \mathbb{N}} \mathcal{B}_k$$

is a basis for the topology of $X$, noticing that the result then follows since $\mathcal{B}$, being 
a countable union of countable sets, is itself countable (see Sect. 1.2.13 of the 
Preliminaries). 

Let $U \subseteq X$ be an open subset, and choose $x \in U$. There is then an $\varepsilon > 0$ such 
that $B_{2\varepsilon}(x) \subseteq U$. Choose $n \in \mathbb{N}$ such that $1/n < \varepsilon$. Since $D$ is dense and $B_{1/n}(x)$ 
is open, it follows that $D \cap B_{1/n}(x)$ is not empty, and thus there is some $m$ such that 
$x_m \in B_{1/n}(x)$, in other words, $d(x, x_m) < 1/n$. Now, the choice of $n$ guarantees 
that $B_{1/n}(x_m) \subseteq B_{2\varepsilon}(x) \subseteq U$, while the choice of $x_m$ guarantees that $x \in B_{1/n}(x_m)$. 
Noticing that $B_{1/n}(x_m) \in \mathcal{B}_m$, the above establishes that for every open set $U$ and 
$x \in U$, there exists a $B \in \mathcal{B}$ with $x \in B \subseteq U$, showing that $\mathcal{B}$ is a basis for the 
topology on $X$. □ 


4.2.2 Convergence in Metric Spaces 

As remarked above, every metric space is in particular a topological space and thus 
has an intrinsic notion of convergence which we now turn to investigate for the main 
examples of metric spaces given above. 

Proposition 4.1 A sequence $\{x_m\}_{m \ge 1}$ in a metric space $X$ converges to $x_0 \in X$ if, 
and only if, $d(x_m, x_0) \to 0$ in $\mathbb{R}$ in the usual sense of convergence. 

Proof Suppose $x_m \to x_0$ in the topological sense. Given $\varepsilon > 0$ consider the open 
ball $B_\varepsilon(x_0)$, which is open in the induced topology. That means that there exists an 
$N \in \mathbb{N}$ such that $x_m \in B_\varepsilon(x_0)$ for all $m > N$. In other words, $d(x_m, x_0) < \varepsilon$ for all 
$m > N$, showing that $d(x_m, x_0) \to 0$. The converse is left for the reader. □ 

Example 4.7 Consider $\mathbb{R}^n$ with the Euclidean metric 

$$d(x, y) = \left( \sum_{k=1}^{n} (x_k - y_k)^2 \right)^{1/2}.$$

For a sequence $\{x^{(m)}\}_{m \ge 1}$ of elements in $\mathbb{R}^n$ and a point $x \in \mathbb{R}^n$, the inequalities 

$$|x_k^{(m)} - x_k| \le d(x^{(m)}, x),$$

valid for all $1 \le k \le n$, show that $x^{(m)} \to x$ implies $x_k^{(m)} \to x_k$, for each component. 
The converse is also true (since there are only finitely many components), and thus 
convergence in the Euclidean metric in $\mathbb{R}^n$ is equivalent to the usual convergence of 
the components in $\mathbb{R}$. 

Example 4.8 Consider the space $\ell_p$, $p \ge 1$ (see Sect. 2.5.4). Much as in the previous 
example, convergence in the $\ell_p$ norm implies the convergence of each of the components. 
However, as each vector now has infinitely many components, the converse 
does not necessarily hold. To see that, take the sequence $\{x^{(m)}\}_{m \in \mathbb{N}}$ where 

$$x_k^{(m)} = \delta_{k,m} = \begin{cases} 1 & \text{if } k = m \\ 0 & \text{otherwise.} \end{cases}$$

Each sequence $\{x_k^{(m)}\}_{m \in \mathbb{N}}$ is eventually constantly 0, and thus the only candidate for 
the limit of $\{x^{(m)}\}_m$ is the 0 vector. However, 

$$d(x^{(m)}, 0) = \left( \sum_{k=1}^{\infty} |x_k^{(m)}|^p \right)^{1/p} = 1,$$

and thus $x^{(m)} \not\to 0$ in the metric sense, and therefore $\{x^{(m)}\}_{m \ge 1}$ does not converge 
at all in $\ell_p$. The same sequence exhibits similar behaviour in $\ell_\infty$ too. 
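The behaviour in Example 4.8 can be observed on finite truncations. The sketch below is only an illustration under an explicit assumption: each sequence is cut off after `length` components, which is harmless as long as the index $m$ stays below `length`. The names `e` and `d_p` are hypothetical.

```python
def e(m, length):
    """First `length` components of x^(m): 1 in position m, 0 elsewhere."""
    return [1.0 if k == m else 0.0 for k in range(1, length + 1)]

def d_p(x, y, p):
    """The l_p distance, computed on the truncated vectors."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

length, p = 50, 2
zero = [0.0] * length
for m in (1, 10, 40):
    x = e(m, length)
    # fifth component (k = 5), distance to 0, distance to the next vector
    print(m, x[4], d_p(x, zero, p), d_p(x, e(m + 1, length), p))
```

Each fixed component is 0 once $m$ exceeds its index, the distance to 0 stays equal to 1, and the distance between two distinct members of the sequence is $2^{1/p}$, so the sequence has no convergent subsequence either.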

Example 4.9 Consider the space $C(I, \mathbb{R})$ of continuous real-valued functions on 
the closed interval $I = [a, b]$, with the metric induced by the $L_\infty$ norm, i.e., for a 
sequence of functions $\{x_m\}_{m \ge 1}$ and a function $x_0$: 

$$x_m \to x_0 \iff \max_{t \in [a, b]} |x_m(t) - x_0(t)| \to 0.$$

Thus, for any fixed $\varepsilon > 0$ there exists an $N \in \mathbb{N}$ such that $|x_m(t) - x_0(t)| < \varepsilon$ for 
all $m > N$ and for all $t \in [a, b]$. In other words, $x_m$ converges to $x_0$ uniformly on 
$[a, b]$. Since the converse is also true, as is easily seen by traversing the arguments 
backwards, we conclude that convergence in the $L_\infty$ norm is nothing but uniform 
convergence of functions on the interval $[a, b]$. 

Example 4.10 For $1 \le p < \infty$ consider the pre-$L_p$ space $C(I, \mathbb{R})$, with $I = [a, b]$, 
i.e., with distance given by 

$$d(x, y) = \left( \int_a^b dt\, |x(t) - y(t)|^p \right)^{1/p}.$$

Convergence with respect to this metric is called $L_p$ convergence. It is well-known, 
from elementary considerations of the Riemann integral, that if $x_m \to x_0$ uniformly, 
then $x_m$ also converges to $x_0$ in the $L_p$ norm. The converse is not generally true, as 
we exemplify for the $L_2$ norm. Consider the following sequence in $C([0, 1], \mathbb{R})$: 

$$x_m(t) = \begin{cases} mt & \text{if } 0 \le t \le \frac{1}{m}, \\ 1 & \text{if } \frac{1}{m} < t \le 1. \end{cases}$$

[Figure: graphs of $x_m$ against $t$, with ticks at $t = 1/10$ and $t = 1/5$.] 

The computation 

$$\int_0^1 dt\, (x_m(t) - 1)^2 = \int_0^{1/m} dt\, (mt - 1)^2 = \frac{1}{3m} \to 0$$

shows that $x_m$ converges in $L_2$ to the constant function $x(t) = 1$. However, pointwise, 
$x_m(t)$ converges to the discontinuous function $x(t) = 1$ for $0 < t \le 1$ and 
$x(0) = 0$. Since the uniform limit of continuous functions must be continuous, it 
follows that $\{x_m\}_{m \ge 1}$ does not converge with respect to the $L_\infty$ norm. 
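A numerical check of Example 4.10 is sketched below. It assumes the piecewise definition $x_m(t) = mt$ on $[0, 1/m]$ and $x_m(t) = 1$ afterwards, approximates the integral by a midpoint rule, and uses hypothetical function names.

```python
def x_m(m, t):
    """The function x_m of Example 4.10 on [0, 1]."""
    return m * t if t <= 1.0 / m else 1.0

def l2_dist_to_one(m, steps=100_000):
    """Midpoint-rule approximation of the L_2 distance from x_m to the constant 1."""
    h = 1.0 / steps
    total = sum((x_m(m, (i + 0.5) * h) - 1.0) ** 2 for i in range(steps))
    return (total * h) ** 0.5

def sup_dist_to_one(m):
    """The largest deviation from 1 occurs at t = 0, where x_m(0) = 0."""
    return abs(x_m(m, 0.0) - 1.0)

for m in (1, 10, 100):
    print(m, l2_dist_to_one(m), (1.0 / (3 * m)) ** 0.5, sup_dist_to_one(m))
```

The second and third columns agree closely and tend to 0, while the last column remains 1 for every $m$: the sequence approaches the constant function 1 in $L_2$ but not uniformly.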

Example 4.11 In fact, $L_p$ convergence need not even imply pointwise convergence 
at any $t$ in the domain. Indeed, consider $C([0, 1], \mathbb{R})$ and, for simplicity, the $L_1$ norm. 
Let $I_m$ be the following sequence of intervals. The first interval is $I_0 = [0, 1]$, the 
entire segment. The next two intervals are $I_1 = [0, 1/2]$ and $I_2 = [1/2, 1]$. The 
next three intervals are obtained by subdividing $[0, 1]$ into three equal segments, 
the next four intervals are obtained by subdividing $[0, 1]$ into four equal segments, 
and so on. Consider for each $m$ the indicator function $x_m = \chi_{I_m}$. These functions 
are not continuous, but let us ignore this for a while. It is clear that, with respect to 
the $L_1$ norm, $x_m$ converges to the constant function 0, since the integral $\int_0^1 dt\, \chi_{I_m}(t)$ is 
the length of the segment that was considered in the $m$-th stage, and the segments’ 
sizes shrink to 0. However, for any $0 \le t \le 1$, the sequence $\{x_m(t)\}_{m \ge 1}$ contains 
a subsequence which is constantly 1 and a subsequence which is constantly 0, and 
thus does not converge. This example would thus serve our purpose, illustrating that 
$L_1$ convergence need not imply pointwise convergence at any point, except that 
the functions are not continuous. This can easily be remedied, with minor changes 
in the details above, by sacrificing a little bit of the segment used to define $x_m$ in 
order to add two linear segments to the graph of $\chi_{I_m}$ so as to obtain the graph of a 
continuous function $x_m$. This can be done in such a way as to only slightly affect 
the integrals so that the $L_1$ convergence is unaltered, while retaining the pointwise 
non-convergence. 
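The sliding intervals of Example 4.11 are easy to enumerate. The sketch below works with the discontinuous indicator functions only (the continuous modification is not implemented), uses exact rational arithmetic, and relies on the hypothetical names `intervals` and `indicator_values`.

```python
from fractions import Fraction

def intervals(stages):
    """Yield I_0, I_1, ...: stage n splits [0, 1] into n equal closed segments."""
    for n in range(1, stages + 1):
        for j in range(n):
            yield Fraction(j, n), Fraction(j + 1, n)

def indicator_values(t, stages):
    """Values at the point t of the indicator functions of the intervals."""
    return [1 if a <= t <= b else 0 for a, b in intervals(stages)]

lengths = [b - a for a, b in intervals(30)]
values = indicator_values(Fraction(1, 3), 30)
print(float(lengths[-1]))        # L_1 norm of the last indicator function
print(sum(values), len(values))  # number of ones among all computed terms
```

The interval lengths, which are the $L_1$ norms, shrink to 0, yet every stage from the second onward contributes at least one value 1 and at least one value 0 at the chosen point, so the sequence of values does not converge.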


Exercises 


Exercise 4.11 Let $X$ be a metric space, $x \in X$, and $\varepsilon > 0$. The closed ball with 
centre $x$ and radius $\varepsilon$ is the set $\bar{B}_\varepsilon(x) = \{y \in X \mid d(x, y) \le \varepsilon\}$. Prove that the 
closure of the open ball $B_\varepsilon(x)$ with respect to the induced topology is the closed ball 
$\bar{B}_\varepsilon(x)$. 

Exercise 4.12 Show that in an arbitrary metric space $X$, the equality $B_\varepsilon(x) = B_\delta(y)$ 
need not imply $x = y$ nor $\varepsilon = \delta$. 

Exercise 4.13 Prove that if $X$ is a normed space, then $B_\varepsilon(x) = B_\delta(y)$ implies $x = y$ 
and $\varepsilon = \delta$. 

Exercise 4.14 Give an example of a metric space which is not second countable. 

Exercise 4.15 Prove Theorem 4.3. 

Exercise 4.16 Let $d_1$ and $d_2$ be two distance functions on a set $X$ such that there 
exist real numbers $\alpha, \beta > 0$ with 

$$\alpha d_1(x, x') \le d_2(x, x') \le \beta d_1(x, x')$$

for all $x, x' \in X$. Prove that $d_1$ and $d_2$ induce the same topology on $X$. 

Exercise 4.17 When is the cofinite topology on a set X metrizable? 

Exercise 4.18 Prove that the coproduct of two metrizable spaces is metrizable. How 
far does this result generalize, e.g., considering more than two spaces? 

Exercise 4.19 Prove that the product of two metrizable spaces is metrizable. How 
far does this result generalize, e.g., considering more than two spaces? 

Exercise 4.20 Find a metrizable space and a quotient of it which is not metrizable. 


4.3 Non-Expanding Functions and Uniform Continuity 

Given two metric spaces $X$ and $Y$, the rich structure embodied in the metrics can 
be preserved to various degrees by functions $f : X \to Y$. In a sense the weakest 
preservation of structure one can impose is for $f$ to be continuous while the strongest 
is for $f$ to be an isometry. In this section we look at two intermediate conditions for 
structure preservation, namely non-expansion and uniform continuity. 

Definition 4.8 A function $f : X \to Y$ between metric spaces is non-expanding if 

$$d(f(x), f(x')) \le d(x, x')$$

for all $x, x' \in X$. 

An important observation is that in any metric space $(X, d)$ the distance function 
$d : X \times X \to \mathbb{R}_+$ is non-expanding. 

Lemma 4.1 Let $X$ be a metric space and consider $X \times X$ with the additive metric 
product structure (Definition 4.4), and $\mathbb{R}_+$ with the metric structure described in 
Example 4.4. Then the distance function $d : X \times X \to \mathbb{R}_+$ is non-expanding. 

Proof Given $(x, x'), (y, y') \in X \times X$ we may assume without loss of generality that 
$d_X(x, x') \ge d_X(y, y')$. Let us further assume that both $d_X(x, x')$ and $d_X(y, y')$ are 
finite. It then follows that 

$$d_{\mathbb{R}_+}(d(x, x'), d(y, y')) = |d_X(x, x') - d_X(y, y')| = d_X(x, x') - d_X(y, y').$$

Since 

$$d_{X \times X}((x, x'), (y, y')) = d_X(x, y) + d_X(x', y')$$

we need to show that 

$$d_X(x, x') \le d_X(x, y) + d_X(y, y') + d_X(y', x')$$

which is indeed the case by the triangle inequality. Treating the cases where either 
$d_X(x, x') = \infty$ or $d_X(y, y') = \infty$ is similar. □ 
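The inequality of Lemma 4.1 can be spot-checked numerically. The sketch below takes $X$ to be the Euclidean plane, an assumption made purely for illustration, and samples random pairs; a sampled check of this kind supports the lemma but is no substitute for the proof.

```python
import math
import random

def d(p, q):
    """Euclidean distance in the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def d_product(pair1, pair2):
    """Additive metric product on X x X (Definition 4.4)."""
    (x, x1), (y, y1) = pair1, pair2
    return d(x, y) + d(x1, y1)

def rand_point():
    return (random.uniform(-5, 5), random.uniform(-5, 5))

random.seed(0)
worst = 0.0
for _ in range(10_000):
    x, x1, y, y1 = (rand_point() for _ in range(4))
    lhs = abs(d(x, x1) - d(y, y1))       # distance between the two values of d
    rhs = d_product((x, x1), (y, y1))    # distance between the two pairs
    worst = max(worst, lhs - rhs)
print(worst <= 1e-12)
```

The quantity `worst` records the largest observed violation of the non-expansion inequality, so it should stay at 0 up to rounding.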

Non-expansion is clearly a rather strong property, quite close in strength to being an 
isometry. Uniform continuity, which we now introduce, is closer to the other end of 
the spectrum of structure preservation and, in a sense, is only slightly stronger than 
continuity. 

Definition 4.9 A function $f : X \to Y$ between metric spaces is uniformly continuous 
if for every $\varepsilon > 0$ there is a $\delta > 0$ such that 

$$d(x, x') < \delta \implies d(f(x), f(x')) < \varepsilon$$

for all $x, x' \in X$. 

It is obvious that if $f : X \to Y$ is non-expanding, then it is uniformly continuous, 
and that uniform continuity implies continuity. Neither of the converse implications 
is generally true. However, the reader is familiar with the fact that continuity on a 
closed interval implies uniform continuity. The crucial property here is that a closed 
interval is compact. Indeed, we now show that if X is compact, then continuity does 
imply uniform continuity. The proof we present relies on the topological concept 
known as the Lebesgue number of a covering, requiring the following preparatory 
concept which is part of the standard vocabulary in metric space theory. 

Definition 4.10 Let $X$ be a metric space and $S \subseteq X$ a subset. The diameter of $S$ is 
the extended real number 

$$\mathrm{diam}(S) = \sup_{x, y \in S} \{d(x, y)\}$$

(which may be $\infty$). 

For instance, if $S = \{x, y\}$, then $\mathrm{diam}(S) = d(x, y)$. In $\mathbb{R}^n$ with the Euclidean metric 
$\mathrm{diam}(B_\varepsilon(x)) = 2\varepsilon$, for all $x \in \mathbb{R}^n$ and $\varepsilon > 0$. More generally, in an arbitrary metric 
space $\mathrm{diam}(B_\varepsilon(x)) \le 2\varepsilon$ always holds and equality may fail. 
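For finite sets the diameter is a maximum over pairs, which makes the strict inequality easy to observe. The following sketch uses a three-point set with the discrete metric; the names `diam`, `ball`, and `discrete` are assumptions for illustration.

```python
from itertools import product

def diam(points, d):
    """Diameter of a finite non-empty set: the largest pairwise distance."""
    return max(d(x, y) for x, y in product(points, repeat=2))

def ball(center, radius, space, d):
    """Open ball of the given radius around center, inside a finite space."""
    return [y for y in space if d(center, y) < radius]

discrete = lambda x, y: 0.0 if x == y else 1.0
space = ["a", "b", "c"]
B = ball("a", 1.0, space, discrete)
print(B, diam(B, discrete))
```

Here the open ball of radius 1 contains only its centre, so its diameter is 0, strictly smaller than twice the radius.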

Lemma 4.2 (Lebesgue Number Lemma) Let $X$ be a compact metric space. If 
$\{U_i\}_{i \in I}$ is an arbitrary open covering of $X$, then there exists a $\delta > 0$ such that 
for all subsets $S \subseteq X$ with $\mathrm{diam}(S) < \delta$ there is an $i \in I$ with $S \subseteq U_i$. 

Proof For each $x \in X$ there is an $i_x \in I$ with $x \in U_{i_x}$, and since $U_{i_x}$ is open there 
is an $\varepsilon_x > 0$ such that $B_{2\varepsilon_x}(x) \subseteq U_{i_x}$. Clearly, the collection $\{B_{\varepsilon_x}(x)\}_{x \in X}$ covers $X$, 
and by compactness there is then a finite subcovering 

$$X = \bigcup_{k=1}^{m} B_{\varepsilon_{x_k}}(x_k).$$

We now show that 

$$\delta = \min\{\varepsilon_{x_1}, \dots, \varepsilon_{x_m}\},$$

which is clearly positive, satisfies the requirement of the lemma. Suppose then that 
$S \subseteq X$ is a subset with $\mathrm{diam}(S) < \delta$, and we may assume $S \ne \emptyset$, so let us fix an 
element $y \in S$. Since $y \in X$ and $X$ is covered by $\{B_{\varepsilon_{x_k}}(x_k)\}_{1 \le k \le m}$ we may find an 
$x = x_{k_0}$, with corresponding $\varepsilon = \varepsilon_{x_{k_0}}$, such that $y \in B_\varepsilon(x)$. It now follows, for any 
$y' \in S$, that $d(y, y') \le \mathrm{diam}(S) < \delta$, and thus $y' \in B_\delta(y)$. However, 

$$B_\delta(y) \subseteq B_\varepsilon(y) \subseteq B_{2\varepsilon}(x) \subseteq U_{i_x}$$

(as the reader is invited to verify) and thus $y' \in U_{i_x}$. As $y' \in S$ was arbitrary it follows 
that $S \subseteq U_{i_x}$, as required. □ 

A $\delta$ as in the statement above is called a Lebesgue number for the given covering. 

Theorem 4.7 (Heine-Cantor) A continuous function $f : X \to Y$ from a compact 
metric space $X$ to an arbitrary metric space $Y$ is uniformly continuous. 

Proof Given $\varepsilon > 0$ consider, for each $y \in Y$, the set 

$$U_y = f^{-1}(B_{\varepsilon/2}(y))$$

which is open since $f$ is continuous. Clearly, $x \in U_{f(x)}$ for all $x \in X$, and thus 
$\{U_y\}_{y \in Y}$ is an open covering of $X$. We may now choose a Lebesgue number $\delta > 0$ 
for that covering. Suppose that 

$$d(x, x') < \delta$$

for points $x, x' \in X$. Then, $\mathrm{diam}(\{x, x'\}) < \delta$ and thus $\{x, x'\} \subseteq f^{-1}(B_{\varepsilon/2}(y))$ 
for some $y \in Y$. That means that $d(f(x), y) < \varepsilon/2$ and that $d(f(x'), y) < \varepsilon/2$. 
Applying the triangle inequality yields 

$$d(f(x), f(x')) \le d(f(x), y) + d(y, f(x')) < \varepsilon$$

as needed. □ 

Since uniform continuity always entails continuity, the result above implies that for 
a function $f : X \to Y$ between metric spaces with a compact domain the concepts 
of continuity and of uniform continuity coincide. 

Exercises 

Exercise 4.21 Prove that the identity function $\mathrm{id} : X \to X$ on any metric space is 
uniformly continuous, and prove that the composition of two uniformly continuous 
functions is uniformly continuous. Repeat the exercise for non-expanding mappings 
instead of uniformly continuous ones. 

Exercise 4.22 For a function $f : X \to Y$ between metric spaces show that if $f$ is 
non-expanding, then $f$ is uniformly continuous. Show as well that if $f$ is uniformly 
continuous, then $f$ is continuous. 

Exercise 4.23 Construct an example of a continuous function $f : X \to Y$ between 
metric spaces that is not uniformly continuous, and an example where $f$ is uniformly 
continuous, but not non-expanding. 

Exercise 4.24 Let $f : \mathbb{R} \to \mathbb{R}$ be a differentiable function with $|f'(x)| \le 1$ for all 
$x \in \mathbb{R}$. Prove that $f$ is non-expanding, when considering the Euclidean metric. 

Exercise 4.25 Let $X$ be a metric space and $\varepsilon > 0$. Show that if $d(x, y) < \varepsilon$, then 
$B_\varepsilon(y) \subseteq B_{2\varepsilon}(x)$. 

Exercise 4.26 Prove that if $f : X \to Y$ is non-expanding, then $\mathrm{diam}(f(S)) \le 
\mathrm{diam}(S)$ for all $S \subseteq X$. Does the converse hold? 

Exercise 4.27 Prove that if $f : X \to Y$ is uniformly continuous, then for every 
$\varepsilon > 0$ there exists a $\delta > 0$ such that $\mathrm{diam}(f(S)) < \varepsilon$ for all $S \subseteq X$ with $\mathrm{diam}(S) < \delta$. 
Does the converse hold? 

Exercise 4.28 Prove that for all subsets $S_1, S_2$ of a metric space $X$, if $S_1 \subseteq S_2$, then 
$\mathrm{diam}(S_1) \le \mathrm{diam}(S_2)$. Does the converse hold? 

Exercise 4.29 Prove that $\mathrm{diam}(B_\varepsilon(x)) \le 2\varepsilon$ for all $\varepsilon > 0$ and for all points $x$ in a 
metric space $X$. 

Exercise 4.30 Give an example of a metric space $X$, a point $x \in X$, and $\varepsilon > 0$ such 
that the equality $\mathrm{diam}(B_\varepsilon(x)) = 2\varepsilon$ fails. 


4.4 Complete Metric Spaces 

This section is devoted to completeness in a metric space. A metric space X is said 
to be complete when every Cauchy sequence in it converges. Thus, in a sense, X is 
complete when every sequence in X that should converge actually does converge. 
The section starts with a discussion of the concept of Cauchy sequences in general 
metric spaces and then considers the completeness of R. The reader may already 
be familiar with the completeness of the real numbers, a property that is commonly 
stated in the form of the least upper bound principle: every non-empty bounded 
above set of real numbers has a supremum (for more details see Sect. 1.2.19 of 
the Preliminaries). We show below that the least upper bound property implies the 
metric completeness of R, namely that any Cauchy sequence in R converges (when 
R is given the Euclidean metric). We then examine the metric spaces introduced 
above, together with an investigation of their completeness. Next we establish two 
fundamental theorems of complete metric spaces, namely the Banach Fixed-Point 
Theorem and Baire’s Theorem. The importance of these results is difficult to
overemphasize, particularly in the context of this book. As will be shown in Chap. 5,
the Banach Fixed-Point Theorem can be used to iteratively solve certain differential
equations. Some of the consequences of Baire’s Theorem in the theory of Banach
spaces will also be given. This section then closes with a description of a completion
process, i.e., a way to turn any metric space, complete or not, into a complete one
while altering it as little as possible. The construction we present is Cantor’s, which
uses Cauchy sequences.


4.4.1 Complete Metric Spaces 

Given a sequence $\{x_m\}_{m \ge 1}$ in a metric space $X$, the statement

\[ \exists x_0 \in X \;\; \forall \varepsilon > 0 \;\; \exists N \in \mathbb{N} \;\; \forall m \ge N : \; d(x_m, x_0) < \varepsilon \]

is nothing but the claim that $x_m \to x_0$. By interchanging the first two quantifiers in
the statement above we arrive at the statement

\[ \forall \varepsilon > 0 \;\; \exists x_\varepsilon \in X, \, N \in \mathbb{N} \;\; \forall m \ge N : \; d(x_m, x_\varepsilon) < \varepsilon \]

which we call the global Cauchy condition. It expresses something considerably
weaker than the existence of a limit. Namely, to establish convergence one needs
to manufacture an appropriate $N \in \mathbb{N}$ for any given $\varepsilon > 0$ without altering the
limiting point. With the global Cauchy condition on the other hand one is first given
an $\varepsilon > 0$, and may then manufacture the limit point $x_\varepsilon$, and a corresponding $N \in \mathbb{N}$.
Thus the global Cauchy condition expresses the existence of a sort of varying limit of
the sequence. In particular, if the sequence converges to $x_0$, then the global Cauchy
condition is satisfied simply by taking $x_\varepsilon = x_0$.

A simple analysis of the situation reveals that the $x_\varepsilon$ can be entirely removed from
the global Cauchy condition, as follows. Suppose that the global Cauchy condition
is satisfied and let $\varepsilon > 0$ be given. There exist then an $x_\varepsilon \in X$ and an $N \in \mathbb{N}$ so that
$d(x_m, x_\varepsilon) < \varepsilon/2$, for all $m \ge N$. It then follows that

\[ d(x_k, x_m) \le d(x_k, x_\varepsilon) + d(x_\varepsilon, x_m) < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon \]

for all $k, m \ge N$. Conversely, suppose that for all $\varepsilon > 0$ there exists an $N \in \mathbb{N}$ such
that

\[ d(x_k, x_m) < \varepsilon \]

for all $k, m \ge N$. Then, taking $x_\varepsilon = x_N$, it follows that

\[ d(x_m, x_\varepsilon) = d(x_m, x_N) < \varepsilon \]

for all $m \ge N$, and so the global Cauchy condition is satisfied. This discussion is
summarized as follows.

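Definition 4.11 A sequence $\{x_m\}_{m \ge 1}$ in a metric space $X$ is a Cauchy sequence if
it satisfies the Cauchy condition:

\[ \forall \varepsilon > 0 \;\; \exists N \in \mathbb{N} \;\; \forall k, m \ge N : \; d(x_k, x_m) < \varepsilon. \]

Equivalently, a sequence $\{x_m\}_{m \ge 1}$ is a Cauchy sequence if it satisfies the global
Cauchy condition

\[ \forall \varepsilon > 0 \;\; \exists x_\varepsilon \in X, \, N \in \mathbb{N} \;\; \forall m \ge N : \; d(x_m, x_\varepsilon) < \varepsilon. \]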
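
As a numerical illustration of the Cauchy condition (not a proof, since only finitely many terms can ever be inspected), the following Python sketch measures how far apart the terms of a finite window of a tail can be. The helper name `tail_spread` and the window length are assumptions made purely for this illustration; on the real line the largest pairwise distance in a finite set is its maximum minus its minimum.

```python
def tail_spread(x, N, window=200):
    """Largest |x(k) - x(m)| over N <= k, m < N + window (Euclidean metric on R)."""
    terms = [x(m) for m in range(N, N + window)]
    return max(terms) - min(terms)


def reciprocal(m):
    # x_m = 1/m converges to 0, hence it is a Cauchy sequence
    return 1.0 / m


def alternating(m):
    # x_m = (-1)^m is bounded but it is not a Cauchy sequence
    return (-1.0) ** m


# The spread of the tails shrinks for 1/m and stays equal to 2 for (-1)^m.
spreads_reciprocal = [tail_spread(reciprocal, N) for N in (10, 100, 1000)]
spreads_alternating = [tail_spread(alternating, N) for N in (10, 100, 1000)]
```

For the first sequence the spread can be made smaller than any prescribed $\varepsilon$ by moving far enough into the tail, while for the second no choice of $N$ brings it below 2.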

The familiar fact that every convergent sequence in $\mathbb{R}$ (when endowed with the
Euclidean metric) is a Cauchy sequence is now nothing but the remark, made above,
that the global Cauchy condition is trivially implied by the existence of a limit. The
converse in $\mathbb{R}$ is well-known to be true (and we prove that fact below) but for general
metric spaces the converse may fail. To see that, start with any convergent sequence
$x_m \to x_0$ in a metric space $X$ satisfying $x_m \ne x_0$, for all $m \ge 1$, and consider
the same sequence $\{x_m\}_{m \ge 1}$ in the subspace $X - \{x_0\}$. Clearly, $\{x_m\}_{m \ge 1}$ satisfies
the Cauchy condition (indeed, it is a convergent sequence in $X$). However, it only
converges in $X$ since if it converged in $X - \{x_0\}$ too, say to the point $y \in X - \{x_0\}$,
then it would also converge to $y$ in $X$, contradicting uniqueness of limits in a metric
space.

Definition 4.12 A metric space $X$ in which every Cauchy sequence admits a limit
in $X$ is said to be a complete metric space.

As promised above we now closely examine the completeness of the reals, which,
as we will see, is a pivotal result.

Example 4.12 (The completeness of $\mathbb{R}$) The space $\mathbb{R}$ with the Euclidean metric
$d(x, y) = |x - y|$ is a most fundamental example of a complete metric space. Recall
(e.g., Sect. 1.2.19 of the Preliminaries) that the real numbers $\mathbb{R}$ are defined to be any
Dedekind complete ordered field.

Given a Cauchy sequence $\{x_m\}_{m \ge 1}$ of real numbers, we first note that $\{x_m\}_{m \ge 1}$ is
bounded by some number $M$, i.e., $|x_m| < M$ for all $m \in \mathbb{N}$. Indeed, for $\varepsilon = 1$ there
exists an $N \in \mathbb{N}$ such that $|x_k - x_m| < 1$ for all $k, m \ge N$, and thus we found a
bounded tail of the sequence, and thus the entire sequence is bounded. Let $S \subseteq \mathbb{R}$ be
the subset consisting of all $x \in \mathbb{R}$ such that the inequality $x < x_m$ holds for almost all
$m \in \mathbb{N}$ (meaning that it holds for all $m \in \mathbb{N}$ with at most finitely many exceptions).
Clearly, for our chosen bound $M$ we have that $-M \in S$ and $M \notin S$; in fact every
$x \in S$ satisfies $x < M$, and thus $S$ is non-empty and bounded above. By the least upper
bound principle in $\mathbb{R}$ the set $S$ admits a supremum $L$. We will proceed to show that $L$
is in fact the limit of the given sequence. Given $\varepsilon > 0$ let $N \in \mathbb{N}$ with
$|x_k - x_m| < \varepsilon/2$ for all $k, m \ge N$. In particular, taking $m = N$, we see that

\[ x_m - \frac{\varepsilon}{2} < x_k < x_m + \frac{\varepsilon}{2} \]

holds for all $k \ge N$. These inequalities imply, by definition of $S$, that

\[ x_m - \frac{\varepsilon}{2} \in S, \quad \text{while} \quad x_m + \frac{\varepsilon}{2} \notin S. \]

Since $L$ is the least upper bound of $S$ it now follows that

\[ x_m - \frac{\varepsilon}{2} \le L \le x_m + \frac{\varepsilon}{2} \]

(for the second inequality note that $S$ is closed downward, so any real number outside
$S$ is an upper bound of $S$). Combining all of the inequalities above, it follows that

\[ |x_k - L| \le |x_k - x_m| + |x_m - L| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon \]

for all $k \ge N$, proving that $x_k \to L$. $\mathbb{R}$ is thus metrically complete.

Remark 4.4 The amount of work put into this example is quite adequately justified 
by noticing that the completeness of R is used in establishing the completeness of 
all of the forthcoming examples of complete metric spaces. 

Remark 4.5 It should be noted that Dedekind completeness is a stronger property 
than metric completeness, in the following sense. While every Dedekind complete 
ordered field is metrically complete, there exist metrically complete ordered fields 
which are not Dedekind complete. None of these examples would be Archimedean 
though. These issues however are not particularly relevant to us here. 




Example 4.13 In contrast to the situation with the real numbers, the metric space $\mathbb{Q}$
of rational numbers with $d(x, y) = |x - y|$ is not complete. Indeed, by the density of
$\mathbb{Q}$ in $\mathbb{R}$, for every irrational $a$ there is a sequence $\{x_m\}_{m \ge 1}$ of rational numbers
with $x_m \to a$ in $\mathbb{R}$. Such a sequence is thus Cauchy in both $\mathbb{R}$ and $\mathbb{Q}$ but fails to
converge in $\mathbb{Q}$.
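
A concrete instance of such a sequence can be sketched with exact rational arithmetic. The Python snippet below (an illustration only; the choice of Newton's iteration for $\sqrt{2}$ and the function name `newton_sqrt2` are assumptions of this sketch, not part of the text) produces rational numbers whose consecutive gaps shrink rapidly, while none of the computed terms squares to 2. That the limit in $\mathbb{R}$ is the irrational number $\sqrt{2}$ is a standard fact which the code does not prove.

```python
from fractions import Fraction


def newton_sqrt2(steps):
    """Rational Newton iterates x_{m+1} = (x_m + 2/x_m)/2, starting from x_1 = 1."""
    x = Fraction(1)
    terms = [x]
    for _ in range(steps - 1):
        x = (x + 2 / x) / 2
        terms.append(x)
    return terms


terms = newton_sqrt2(6)
# Distances between consecutive terms: they decrease rapidly.
gaps = [abs(terms[i + 1] - terms[i]) for i in range(len(terms) - 1)]
# |x_m^2 - 2| is tiny but never zero: every term is rational, sqrt(2) is not.
residuals = [abs(t * t - 2) for t in terms]
```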

Before examining the completeness property for the metric spaces introduced above, 
we remark that completeness is a metric invariant. In fact, we establish a slightly 
stronger result. 

Proposition 4.2 (Uniformly Continuous Functions Preserve Cauchy Sequences) If
$f : X \to Y$ is a uniformly continuous function between metric spaces and $\{x_m\}_{m \ge 1}$
is a Cauchy sequence in $X$, then $\{f(x_m)\}_{m \ge 1}$ is a Cauchy sequence in $Y$.

Proof Given $\varepsilon > 0$ let $\delta > 0$ be such that

\[ d(x, x') < \delta \implies d(f(x), f(x')) < \varepsilon. \]

Since $\{x_m\}_{m \ge 1}$ is Cauchy, there exists an $N \in \mathbb{N}$ such that

\[ k, m \ge N \implies d(x_k, x_m) < \delta. \]

Combining the two implications yields

\[ k, m \ge N \implies d(f(x_k), f(x_m)) < \varepsilon, \]

as needed for showing that $\{f(x_m)\}_{m \ge 1}$ is Cauchy. □
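
Uniform continuity cannot be weakened to mere continuity in this proposition. The following Python sketch, given under the assumption that we take $X = (0, 1]$ with the Euclidean metric and $f(x) = 1/x$ (an example chosen here for illustration), shows a tail of the Cauchy sequence $x_m = 1/m$ whose terms are very close together while their images $f(x_m) = m$ are spread far apart.

```python
def spread(values):
    """Largest pairwise distance |a - b| among finitely many real numbers."""
    return max(values) - min(values)


def f(x):
    # continuous on (0, 1], but not uniformly continuous there
    return 1.0 / x


# A tail of the Cauchy sequence x_m = 1/m in (0, 1] ...
tail = [1.0 / m for m in range(1000, 2000)]
# ... and its image under f, which is approximately m.
image_tail = [f(x) for x in tail]

tail_spread = spread(tail)          # below 0.001
image_spread = spread(image_tail)   # close to 999
```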

Corollary 4.5 Let $f : X \to Y$ be an invertible continuous function between metric
spaces such that $f^{-1} : Y \to X$ is uniformly continuous. If $X$ is complete, then so
is $Y$.

Proof Let $\{y_m\}_{m \ge 1}$ be a Cauchy sequence in $Y$. Then $\{f^{-1}(y_m)\}_{m \ge 1}$ is a Cauchy
sequence in $X$ and since $X$ is complete the sequence converges: $f^{-1}(y_m) \to x_0$.
Since a continuous function preserves limits of sequences, we may now apply $f$ to
conclude that $y_m \to f(x_0)$. □

Corollary 4.6 Completeness is a metric invariant. 

Example 4.14 (Completeness of $\mathbb{R}^n$) For the metric space $\mathbb{R}^n$ with the Euclidean
metric

\[ d(x, y) = \left( \sum_{k=1}^{n} |x_k - y_k|^2 \right)^{1/2} \]

we already saw that $x^{(m)} \to x$ if, and only if, $x_k^{(m)} \to x_k$ for each $1 \le k \le n$. To show
that $\mathbb{R}^n$ is complete, suppose that $\{x^{(m)}\}_{m \ge 1}$ is a Cauchy sequence. Then, for each
$1 \le k \le n$, the sequence $\{x_k^{(m)}\}_{m \ge 1}$ is also Cauchy (since $|x_k - y_k| \le d(x, y)$)
and since $\mathbb{R}$ is complete, we conclude that $x_k^{(m)} \to x_k$ for all $1 \le k \le n$, and
therefore that $x^{(m)} \to (x_1, \dots, x_n)$. Thus, every Cauchy sequence in $\mathbb{R}^n$ converges,
as claimed.

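Example 4.15 (Completeness of $\mathbb{C}^n$) The metric space $\mathbb{C}^n$ with distance given by

\[ d(x, y) = \left( \sum_{k=1}^{n} |x_k - y_k|^2 \right)^{1/2} \]

is complete. It was already remarked that $\mathbb{C}^n$ is isometric to $\mathbb{R}^{2n}$, and thus, since
completeness is a metric invariant, the completeness of $\mathbb{R}^{2n}$ implies that of $\mathbb{C}^n$.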

Example 4.16 For a closed interval $I = [a, b]$ the space $C(I, \mathbb{R})$ of continuous
functions on $I$, with the metric structure induced by the $L_\infty$ norm, is complete. To
see that, recall that convergence with respect to this metric is uniform convergence of
sequences of functions. If $\{x_m\}_{m \ge 1}$ is a Cauchy sequence of continuous real-valued
functions on $I$, then one easily sees that for each $t \in [a, b]$ the sequence $\{x_m(t)\}_{m \ge 1}$
is a Cauchy sequence in $\mathbb{R}$ with respect to the Euclidean metric. As $\mathbb{R}$ is complete it
follows that $x_m(t) \to x(t)$, showing that $\{x_m\}_{m \ge 1}$ converges point-wise to a function
$x : [a, b] \to \mathbb{R}$, and in fact the convergence is easily seen to be uniform: given
$\varepsilon > 0$, choose $N$ with $|x_m(t) - x_k(t)| < \varepsilon$ for all $k, m \ge N$ and all $t \in [a, b]$, and
let $k$ tend to infinity to obtain $|x_m(t) - x(t)| \le \varepsilon$ for all $m \ge N$ and all $t \in [a, b]$.
To conclude we recall the well-known fact that if a sequence of continuous functions
converges uniformly to a function $x$, then $x$ is continuous. Thus, $x \in C(I, \mathbb{R})$ and is the
uniform limit of the given sequence.

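Example 4.17 In contrast to the $L_\infty$ case, for all $1 \le p < \infty$ the pre-$L_p$ space
$C([a, b], \mathbb{R})$ is not complete. For simplicity let us consider the case $[a, b] = [0, 1]$.
For $m \ge 2$ let

\[ x_m(t) = \begin{cases} 1 & \text{if } 0 \le t \le \frac{1}{2}, \\ 1 - m\left(t - \frac{1}{2}\right) & \text{if } \frac{1}{2} \le t \le \frac{1}{2} + \frac{1}{m}, \\ 0 & \text{if } \frac{1}{2} + \frac{1}{m} \le t \le 1. \end{cases} \]

Then, for $k > m$, the functions $x_m$ and $x_k$ take values in $[0, 1]$ and agree outside
the interval $[\frac{1}{2}, \frac{1}{2} + \frac{1}{m}]$, so that

\[ \|x_m - x_k\|_p^p = \int_{1/2}^{1/2 + 1/m} dt \, |x_m(t) - x_k(t)|^p \le \frac{1}{m} \]

and thus $\{x_m\}_{m \ge 2}$ is Cauchy. However, if this sequence converged to a continuous
function $x : [0, 1] \to \mathbb{R}$, then

\[ \int_0^{1/2} dt \, |x(t) - x_m(t)|^p \le \|x - x_m\|_p^p \to 0. \]

The left-hand side does not depend on $m$, since $x_m(t) = 1$ for $0 \le t \le 1/2$, and thus
$x(t) = 1$ for all $0 \le t \le 1/2$. Similarly $x(t) = 0$ for all $1/2 < t \le 1$, but
then $x$ cannot be continuous.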
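
The contrast between the $L_p$ metric and the uniform metric on this kind of sequence can be checked numerically. The Python sketch below assumes that $x_m$ equals 1 on $[0, 1/2]$, decreases linearly to 0 on $[1/2, 1/2 + 1/m]$, and vanishes afterwards (the linear interpolation, the midpoint-rule integration, and all function names are assumptions of this illustration). It compares $x_m$ with $x_{2m}$ for $p = 1$, where the exact distance is $1/(4m)$, and in the sup norm, where the distance is $1/2$ for every $m$.

```python
def x(m, t):
    """Ramp function: 1 on [0, 1/2], linear down to 0 on [1/2, 1/2 + 1/m], then 0."""
    if t <= 0.5:
        return 1.0
    if t >= 0.5 + 1.0 / m:
        return 0.0
    return 1.0 - m * (t - 0.5)


def lp_distance(m, k, p, cells=20000):
    """Midpoint-rule approximation of the L_p distance of x_m and x_k on [0, 1]."""
    h = 1.0 / cells
    total = sum(abs(x(m, (i + 0.5) * h) - x(k, (i + 0.5) * h)) ** p
                for i in range(cells))
    return (total * h) ** (1.0 / p)


def sup_distance(m, k, cells=20000):
    """Maximum of |x_m - x_k| over a uniform grid of [0, 1]."""
    h = 1.0 / cells
    return max(abs(x(m, i * h) - x(k, i * h)) for i in range(cells + 1))


ms = (2, 4, 8, 16)
d1 = [lp_distance(m, 2 * m, 1) for m in ms]     # shrinks like 1/(4m)
dsup = [sup_distance(m, 2 * m) for m in ms]     # stays at 1/2
```

The sequence is therefore Cauchy for the $L_1$ metric but not for the uniform metric, which is consistent with the completeness of $C(I, \mathbb{R})$ under the $L_\infty$ norm.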

Example 4.18 Going back to $C(I, \mathbb{R})$, $I = [a, b]$ and $a < b$, with the metric
obtained from the $L_\infty$ norm (for which convergence is equivalent to uniform
convergence), which was shown to be complete, consider the subspace $P \subseteq C(I, \mathbb{R})$
consisting only of the polynomial functions. Although it is a subspace of the complete
space $C(I, \mathbb{R})$, the space $P$ is not complete. To see that, recall that the polynomials

\[ p_m(t) = \sum_{k=0}^{m} \frac{t^k}{k!} \]

converge uniformly to the exponential function $e^t$, which is not a polynomial. Thus
the sequence $\{p_m\}_{m \ge 1}$ is Cauchy (since it converges in $C(I, \mathbb{R})$) but it does not
converge to a polynomial and thus does not converge in $P$ at all (since if it did,
then it would have two different limits in $C(I, \mathbb{R})$).
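
The uniform convergence of the Taylor polynomials of the exponential function can be observed numerically. The following Python sketch assumes $I = [0, 1]$ and approximates the sup norm by a maximum over a finite grid; both choices, as well as the names `taylor_exp` and `sup_error`, are assumptions made only for this illustration.

```python
import math


def taylor_exp(m, t):
    """p_m(t) = sum of t^k / k! for k = 0, ..., m."""
    return sum(t ** k / math.factorial(k) for k in range(m + 1))


# Sample points in the interval I = [0, 1].
grid = [i / 100 for i in range(101)]


def sup_error(m):
    """Grid approximation of the uniform distance between p_m and exp on I."""
    return max(abs(taylor_exp(m, t) - math.exp(t)) for t in grid)


errors = [sup_error(m) for m in (1, 2, 4, 8)]
```

The errors decrease rapidly, in line with the fact that the polynomials form a Cauchy sequence in $P$ whose only possible limit in $C(I, \mathbb{R})$ is not a polynomial.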

Example 4.19 The space $\ell_p$ (of all absolutely $p$-power summable sequences of, say,
real numbers) is complete. To see that, let $\{x^{(m)}\}_{m \ge 1}$ be a Cauchy sequence in $\ell_p$.
Since $|x_k^{(m)} - x_k^{(n)}| \le \|x^{(m)} - x^{(n)}\|_p$, for all $k \ge 1$, it follows readily that $\{x_k^{(m)}\}_{m \ge 1}$
is a Cauchy sequence in $\mathbb{R}$, and thus it converges to a point $x_k \in \mathbb{R}$. Thus $\{x^{(m)}\}_{m \ge 1}$
converges point-wise to $x = (x_1, \dots, x_k, \dots)$. The completeness of $\ell_p$ will follow
by showing that the convergence is also in the $\ell_p$ norm. Firstly, to verify that $x \in \ell_p$,
note that since $\{x^{(m)}\}_{m \ge 1}$ is Cauchy, it is bounded, say by $M$. Now, for all $K \ge 1$
and $m \ge 1$

\[ \left( \sum_{k=1}^{K} |x_k^{(m)}|^p \right)^{1/p} \le \|x^{(m)}\|_p \le M. \]

Allowing $m$ to tend to $\infty$, and then $K$ to tend to $\infty$, shows that $\|x\|_p \le M$, and
thus $x \in \ell_p$. Finally, to show that $x^{(m)} \to x$ in the $\ell_p$ norm, let $\varepsilon > 0$ be given. Let
$N \in \mathbb{N}$ with

\[ \|x^{(m)} - x^{(n)}\|_p < \varepsilon \]

for all $m, n \ge N$. We now have, for all $K \ge 1$ and $m, n \ge N$, that

\[ \left( \sum_{k=1}^{K} |x_k^{(m)} - x_k^{(n)}|^p \right)^{1/p} < \varepsilon \]

and thus, letting $n$ tend to infinity, and then $K$ tend to infinity, it follows that

\[ \|x^{(m)} - x\|_p \le \varepsilon \]

for all $m \ge N$, as required.
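
For a concrete feeling of a Cauchy sequence in $\ell_2$, consider the truncations $x^{(m)} = (1, 1/2, \dots, 1/m, 0, 0, \dots)$ of the sequence $(1/k)_{k \ge 1}$, which is square summable. The Python sketch below (the example and the function names are assumptions made for illustration) computes $\|x^{(m)} - x^{(2m)}\|_2$, which is bounded by $1/\sqrt{m}$ since it is the square root of the sum of $1/k^2$ over $m < k \le 2m$.

```python
def truncated_harmonic(m):
    """x^(m) = (1, 1/2, ..., 1/m, 0, 0, ...), stored as its finite non-zero part."""
    return [1.0 / k for k in range(1, m + 1)]


def lp_norm_of_difference(x, y, p):
    """l_p norm of x - y for two finitely supported sequences."""
    n = max(len(x), len(y))
    x = x + [0.0] * (n - len(x))   # pad with zeros to a common length
    y = y + [0.0] * (n - len(y))
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


ms = (10, 100, 1000)
dist = [lp_norm_of_difference(truncated_harmonic(m), truncated_harmonic(2 * m), 2)
        for m in ms]
```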


4.4.2 Banach’s Fixed-Point Theorem 

A fixed-point for a function $f : X \to X$ is a point $x_0 \in X$ such that $f(x_0) = x_0$. The
importance of fixed-points lies in the ability to phrase the solutions of many important
problems as fixed-points for suitable functions (this technique will be demonstrated
in Chap. 5).

The following metric condition will be shown to guarantee the existence of a
unique fixed-point for a function $f$ as above, provided $X$ is a non-empty complete
metric space.

Definition 4.13 A function $f : X \to Y$ between metric spaces is called a contraction
if there exists a real number $\alpha$ with $0 < \alpha < 1$ such that

\[ d(f(x_1), f(x_2)) \le \alpha \, d(x_1, x_2) \]

for all $x_1, x_2 \in X$.

We note at once that a function need not be a contraction in order to possess
fixed-points. An extreme example is the identity function $\mathrm{id} : X \to X$ on any metric
space with more than one point. It is not a contraction yet every point is a fixed-point
for it.

Theorem 4.8 (Banach Fixed-Point Theorem) If $X$ is a complete non-empty metric
space and $f : X \to X$ is a contraction, then $f$ has a unique fixed-point $x_0 \in X$.

Proof To prove the existence of a fixed-point, let $x_1$ be an arbitrary element in $X$
and consider the sequence $x_2 = f(x_1)$, $x_3 = f(x_2)$, and in general $x_{m+1} = f(x_m)$.
If this sequence converges to some limit $x_0 \in X$, then, since a contraction is continuous
and continuous functions preserve limits of sequences, we will have that

\[ f(x_0) = f\left( \lim_{m \to \infty} x_m \right) = \lim_{m \to \infty} f(x_m) = \lim_{m \to \infty} x_{m+1} = \lim_{m \to \infty} x_m = x_0 \]

and so $x_0$ is a fixed-point of $f$. To show that the sequence does indeed converge,
we employ the completeness of $X$ and proceed to show that the sequence is Cauchy.
Since the function $f$ is a contraction, we know that there exists an $0 < \alpha < 1$
such that

\[ d(f(x), f(y)) \le \alpha \, d(x, y) \]

for all $x, y \in X$. We need to estimate the quantities $d(x_k, x_m)$ for the sequence
defined above, and we notice first that for all $x, y \in X$, the triangle inequality yields

\[ d(x, y) \le d(x, f(x)) + d(f(x), f(y)) + d(f(y), y) \le d(x, f(x)) + \alpha \, d(x, y) + d(f(y), y) \]

and thus

\[ d(x, y) \le \frac{d(x, f(x)) + d(y, f(y))}{1 - \alpha} \]

which immediately shows that if $x$ and $y$ are both fixed-points of $f$, then $d(x, y) = 0$,
and thus $x = y$, so the uniqueness clause is proven.

Since $d(x_m, f(x_m)) = d(f(x_{m-1}), f(x_m)) \le \alpha \, d(x_{m-1}, f(x_{m-1}))$ for all $m \ge 2$,
induction gives $d(x_m, f(x_m)) \le \alpha^{m-1} d(x_1, f(x_1))$. Applying the inequality
displayed above with $x = x_k$ and $y = x_m$ then shows that

\[ d(x_k, x_m) \le \frac{\alpha^{k-1} + \alpha^{m-1}}{1 - \alpha} \, d(x_1, f(x_1)) \]

immediately establishing that $\{x_m\}_{m \ge 1}$ is a Cauchy sequence.

□
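
The iteration used in the proof is easy to try out. The following Python sketch is an illustration under stated assumptions: we take $X = [0, 1]$ with the Euclidean metric and $f = \cos$, which maps $[0, 1]$ into itself and, by the mean value theorem, is a contraction there with constant $\alpha = \sin 1 < 1$. The helper name `fixed_point_iterates` and the number of steps are arbitrary choices. Letting $m$ tend to infinity in the Cauchy estimate for the iterates gives the a priori bound $d(x_k, x_0) \le \frac{\alpha^{k-1}}{1 - \alpha} d(x_1, f(x_1))$, which the code compares with the observed errors.

```python
import math


def fixed_point_iterates(f, x1, steps):
    """Return [x_1, x_2, ..., x_steps] with x_{m+1} = f(x_m)."""
    xs = [x1]
    for _ in range(steps - 1):
        xs.append(f(xs[-1]))
    return xs


# |cos'(t)| = |sin t| <= sin 1 < 1 on [0, 1], so cos is a contraction there.
alpha = math.sin(1.0)
xs = fixed_point_iterates(math.cos, 1.0, 60)
x_star = xs[-1]   # numerical stand-in for the fixed point x_0

# A priori bound alpha^(k-1) / (1 - alpha) * d(x_1, f(x_1)) for k = 1, ..., 30.
bounds = [alpha ** (k - 1) / (1 - alpha) * abs(xs[0] - xs[1]) for k in range(1, 31)]
errors = [abs(xs[k - 1] - x_star) for k in range(1, 31)]
```

In this example the observed errors are considerably smaller than the bound, which is to be expected since $\sin 1$ overestimates the contraction rate near the fixed point.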


4.4.3 Baire’s Theorem 

The result we present below has numerous important implications in analysis in 
general and in functional analysis and Banach space theory in particular. Baire’s 
theorem can be interpreted as stating that a non-empty complete metric space can 
not be small when size is topologically measured as follows. The set Q of rational 
numbers is countable while R is uncountable and thus, in the sense of cardinality 
of sets, Q is negligible compared to R. However, from a topological point-of-view, 
since the closure of Q is all of R, the set Q is quite large. The opposite situation is 
that of a nowhere dense set, namely a set S whose closure has empty interior, and 
is thus a (topologically speaking) very small set. Sets which are countable unions of 
nowhere dense sets are called meager sets. The claim of Baire’s Theorem is that a 
non-empty complete metric space is never meager. 

Remark 4.6 Many of the definitions below can be stated in the greater generality of 
topological spaces. However, we keep the discussion rooted in metric spaces since 
the added generality is not of importance here. 

Definition 4.14 A subset $S \subseteq X$ in a metric space $X$ is said to be nowhere dense if
the interior of its closure is empty. Equivalently, $S$ is nowhere dense when its closure
$\overline{S}$ contains no non-empty open subsets.

Nowhere dense sets are, in a sense, very small. For instance, if $x$ is not an isolated
point, i.e., $\{x\}$ is not open, then $\{x\}$ is nowhere dense. It is obvious that a subset of
a nowhere dense subset is itself nowhere dense, and that a finite union of nowhere
dense subsets is itself nowhere dense. However, a countable union of nowhere dense
subsets need not be nowhere dense. For instance, in $\mathbb{R}$ with the Euclidean metric no
point is isolated and thus every singleton $\{x\}$ is nowhere dense, and consequently
every finite subset of $\mathbb{R}$ is nowhere dense. However, $\mathbb{Q}$ (which is a countable union
of singleton sets, and thus of nowhere dense sets) is not nowhere dense. In fact, it is
everywhere dense, i.e., its closure is the entire space $\mathbb{R}$.

Definition 4.15 A subset $S \subseteq X$ in a metric space $X$ is said to be meager if it is a
countable union of nowhere dense subsets.

Clearly, every nowhere dense subset is meager, but not vice-versa (e.g., Q is meager 
in R but not nowhere dense). It is again left for the reader to establish that a subset of 
a meager set is itself meager, and that a countable union of meager subsets is meager. 

Since X is a subset of itself we may speak of X itself being meager. We may now 
state the main theorem. 

Theorem 4.9 (Baire) A complete non-empty metric space is not meager. 

Proof Suppose that $X$ is a complete, non-empty, and meager metric space. Then

\[ X = \bigcup_{m=1}^{\infty} S_m \]

where each $S_m \subseteq X$ is nowhere dense. By noting that the closure of a nowhere dense
subset is itself nowhere dense, we may further assume (by taking closures) that each
$S_m$ is closed. Since $X$ is non-empty $X$ is not nowhere dense, and thus, for all $m \ge 1$,
one has that $S_m \ne X$.

It follows that $X - S_1$ is a non-empty open subset of $X$, and so there exists a point
$x_1 \in X$ and $\varepsilon_1 > 0$ with

\[ B_{\varepsilon_1}(x_1) \subseteq X - S_1. \]

Since $S_2$ does not contain any non-empty open subset, it certainly does not contain
$B_{\varepsilon_1}(x_1)$. Thus the set $B_{\varepsilon_1}(x_1) \cap (X - S_2)$, which is open, is not empty. Therefore,
there exists a point $x_2 \in X$ and $\varepsilon_2 > 0$ with

\[ B_{\varepsilon_2}(x_2) \subseteq B_{\varepsilon_1}(x_1) \cap (X - S_2). \]

Continuing in this manner we obtain, for each $m \ge 1$, a point $x_m \in X$ and $\varepsilon_m > 0$
such that

\[ B_{\varepsilon_{m+1}}(x_{m+1}) \subseteq B_{\varepsilon_m}(x_m) \cap (X - S_{m+1}). \]

Notice that, by shrinking the radii if necessary, we may assume that $\varepsilon_m < 1/m$ holds
for all $m \ge 1$ and that the closure of $B_{\varepsilon_{m+1}}(x_{m+1})$ is contained in $B_{\varepsilon_m}(x_m)$. Since
$x_k \in B_{\varepsilon_m}(x_m)$ for all $k \ge m$, this readily implies
that $\{x_m\}_{m \ge 1}$ is a Cauchy sequence, which thus converges to a limit point $x_0 \in X$.
Our next claim is that the limit point $x_0$ belongs to $B_{\varepsilon_m}(x_m)$ for each $m \ge 1$. Indeed,
for all $k > m$, the ball $B_{\varepsilon_k}(x_k)$ is contained
in the closure of $B_{\varepsilon_{m+1}}(x_{m+1})$, which in turn is contained in $B_{\varepsilon_m}(x_m)$. Thus, a tail of
the sequence is contained in the closure of $B_{\varepsilon_{m+1}}(x_{m+1})$, which is closed, and thus
the limit is contained in it too, and thus the limit is in $B_{\varepsilon_m}(x_m)$. Now, since $\{S_m\}_{m \ge 1}$
covers $X$, we must have $x_0 \in S_{m_0}$ for some $m_0 \ge 1$. But then, $x_0 \notin X - S_{m_0}$ and
thus $x_0 \notin B_{\varepsilon_{m_0}}(x_{m_0})$, a contradiction. □

Corollary 4.7 A complete non-empty metric space X with no isolated points must 
be uncountable. 

Proof If $X$ were countable, then since

\[ X = \bigcup_{x \in X} \{x\} \]

and each $\{x\}$ is nowhere dense it would follow that $X$ is meager, contradicting
Baire’s Theorem. □

Corollary 4.8 The set R of real numbers is uncountable. 

Proof When endowed with the Euclidean metric, the space R is complete and has 
no isolated points. □ 


4.4.4 Completion of a Metric Space 

If X is a metric space which is not complete, then it is because there exist Cauchy 
sequences in it that fail to converge in X. Quite often though the space X is seen to 
be embedded inside a larger metric space which is complete. This is the situation 
for instance for Q as a metric subspace of R with the Euclidean metric. There are 
however other situations where it is not a priori clear that the metric space in question
embeds in a complete one. For instance, the space $C(I, \mathbb{R})$ with the metric obtained
from the $L_2$ norm is not complete (as we have seen) and it is not evident that it is
embedded in a larger, complete, space.

It is thus natural to ask if any metric space can be completed. That is, whether 
every metric space embeds in a potentially (much) larger complete metric space. Of 
course, one would have to set some expectations from such a completion, primarily 
that it is not too wasteful. In more detail, we are willing to accept that the completion 
may need to be larger than the original space, but we do not want it to be any larger 
than it must be. The following definition turns this rather vague formulation into a 
precise condition. 

Definition 4.16 Let $X$ be a metric space. A complete metric space $Y$ together with
an isometry $\iota : X \to Y$ is said to be a metric completion (or simply a completion) of
$X$ if $\iota(X)$ is dense in $Y$. It is common to speak of $Y$ as a completion of $X$, leaving
$\iota : X \to Y$ implicit, and it is also common to write $\hat{X}$ for a completion.

Example 4.20 Clearly, the inclusion $\iota : \mathbb{Q} \to \mathbb{R}$ expresses $\mathbb{R}$ as a completion of $\mathbb{Q}$,
when both sets are endowed with the metric $d(x, y) = |x - y|$.

The following theorem states that uniformly continuous functions with complete 
codomains can be extended uniquely to a completion of the domain. This is a very 
important property of metric completions. 

Theorem 4.10 Let $f : X \to Y$ be a uniformly continuous function between metric
spaces, with $Y$ complete. If $\iota : X \to \hat{X}$ is a completion, then there exists a unique
uniformly continuous function $F : \hat{X} \to Y$ which extends $f$, i.e., such that $f = F \circ \iota$.

Proof We present here only the outline of the proof, inviting the reader to provide
further details.

1. To construct an extension $F : \hat{X} \to Y$ let $x_0 \in \hat{X}$ be arbitrary and consider any
sequence $\{x_m\}_{m \ge 1}$ in $X$ with $\iota(x_m) \to x_0$. This can be done since $\iota(X)$ is dense
in $\hat{X}$.

2. Since $\{\iota(x_m)\}_{m \ge 1}$ converges, it is a Cauchy sequence, and since $\iota$ is an isometry,
$\{x_m\}_{m \ge 1}$ is Cauchy as well.

3. Since a uniformly continuous function preserves Cauchy sequences, it follows
that the sequence $\{f(x_m)\}_{m \ge 1}$ is Cauchy in $Y$.

4. Since $Y$ is complete, every Cauchy sequence converges, and thus $f(x_m) \to y_0$,
for some $y_0 \in Y$, and we denote $F(x_0) = y_0$.

5. At this point one needs to verify that $F(x_0)$ is independent of the chosen sequence
$\{x_m\}_{m \ge 1}$. This can be done by showing that for any other sequence $\{x'_m\}_{m \ge 1}$ in
$X$ with $\iota(x'_m) \to x_0$,

\[ d\left( \lim_{m \to \infty} f(x_m), \lim_{m \to \infty} f(x'_m) \right) = 0 \]

(by appealing to the continuity of the distance function).

6. The verification that $F$ extends $f$, namely that $F(\iota(x)) = f(x)$ for all $x \in X$, is
immediate by considering the constant sequence $\{x_m\}_{m \ge 1}$, $x_m = x$.

7. One then needs to verify that $F$ is uniformly continuous. Consequently, $F$ is
continuous.

8. Finally, the uniqueness of $F$ follows since any two continuous functions that agree
on a dense subset agree everywhere. □
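
Steps 1 to 5 of the outline can be illustrated numerically. In the Python sketch below we assume $X = \mathbb{Q} \cap [0, 2]$, viewed inside its completion $[0, 2]$, and $f(q) = q^2$, which is uniformly continuous on $X$ because $|f(p) - f(q)| \le 4|p - q|$ there. Two different rational sequences approaching the point $\sqrt{2}$ of the completion are pushed through $f$, and the resulting values approach the same number, namely 2. The choice of example and the names `newton` and `bisection` are assumptions made for this illustration only.

```python
from fractions import Fraction


def f(q):
    """A uniformly continuous function on the rationals of [0, 2]."""
    return q * q


def newton(m):
    """m-th Newton iterate for sqrt(2), starting from 1 (a rational number)."""
    x = Fraction(1)
    for _ in range(m - 1):
        x = (x + 2 / x) / 2
    return x


def bisection(m):
    """Midpoint of the interval obtained after m bisection steps of [1, 2]."""
    lo, hi = Fraction(1), Fraction(2)
    for _ in range(m):
        mid = (lo + hi) / 2
        if mid * mid < 2:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Images under f of two different sequences with the same limit in the completion.
values_newton = [f(newton(m)) for m in (2, 4, 6, 8)]
values_bisection = [f(bisection(m)) for m in (5, 10, 15, 20)]
gap = abs(values_newton[-1] - values_bisection[-1])
```

Both lists of values approach 2, so the value assigned to $\sqrt{2}$ by the extension does not depend on which of the two sequences is used.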

Remark 4.7 The ability to extend a uniformly continuous function to a completion
of the domain does not hold for arbitrary continuous functions. For instance, take
the set $S = \{1/n \mid n \in \mathbb{N}\}$ as a metric subspace of $\mathbb{R}$ under the Euclidean metric.
Topologically, it is a discrete space and thus any function $f : S \to Y$, to any metric
space $Y$, is continuous. Such a function merely corresponds to a sequence in $Y$. Now,
let $\hat{S} = S \cup \{0\}$ and notice that with the inclusion function $\iota : S \to \hat{S}$, we obtain a
completion of $S$. The space $\hat{S}$ is not discrete and indeed one can easily verify that
a continuous function $F : \hat{S} \to Y$ corresponds to a sequence in $Y$ together with its
limit point. In other words, while continuous functions $f : S \to Y$ correspond to
arbitrary sequences in $Y$, continuous functions $F : \hat{S} \to Y$ correspond to convergent
sequences in $Y$. Clearly, there are plenty of examples of complete metric spaces
$Y$ where not every sequence converges, and thus not every continuous function
$f : S \to Y$ extends to a continuous function $F : \hat{S} \to Y$.

We now pose and answer two natural questions about completions of a metric space
$X$, namely whether a completion always exists and whether it is unique. We first address
the question of uniqueness. Suppose that $\iota : X \to Y$ is a completion. Given any
bijection $g : Y \to Y'$, with $Y'$ an arbitrary set, the metric structure of $Y$ can be
transferred to a metric structure on $Y'$ by defining $d' : Y' \times Y' \to \mathbb{R}_+$ by
$d'(y'_1, y'_2) = d(g^{-1}(y'_1), g^{-1}(y'_2))$, and then $g$ is a global isometry. Informally, there is no essential
difference between $Y$ and $Y'$, and thus it is expected that $Y'$ can also serve as a
completion for $X$. This is indeed the case, as the reader can check that the composition
$g \circ \iota : X \to Y'$ is a completion. It thus follows that a completion is never unique
(except when it is empty), but, more importantly, we are guided to ask a more refined
question: Given two completions $\iota : X \to Y$ and $\iota' : X \to Y'$ for the same metric
space $X$, are $Y$ and $Y'$ isometric? The affirmative answer is embodied in the following
result.

Theorem 4.11 Let $X$ be a metric space. If $\iota : X \to Y$ and $\iota' : X \to Y'$ are
completions of $X$, then there exists a global isometry $\rho : Y \to Y'$.

Proof Since $\iota' : X \to Y'$ is an isometry, it is in particular uniformly continuous, and
thus $\iota'$ extends uniquely to a uniformly continuous function $\rho : Y \to Y'$, satisfying
that $\rho \circ \iota = \iota'$. Reversing the roles of $\iota$ and $\iota'$, the same argument yields a uniformly
continuous function $\rho' : Y' \to Y$ with $\rho' \circ \iota' = \iota$. Let us now consider $\rho' \circ \rho : Y \to Y$,
for which we have that $\rho' \circ \rho \circ \iota = \rho' \circ \iota' = \iota$. Thus, $\rho' \circ \rho : Y \to Y$ is a uniformly
continuous function which extends $\iota$. However, quite trivially, the identity function
$\mathrm{id}_Y : Y \to Y$ satisfies $\mathrm{id}_Y \circ \iota = \iota$, namely it too is a uniformly continuous function
extending $\iota$. But, such an extension was proven above to be unique, and thus we
must conclude that $\rho' \circ \rho = \mathrm{id}_Y$. A similar argument reveals that $\rho \circ \rho' = \mathrm{id}_{Y'}$,
and thus that $\rho$ is bijective and $\rho'$ is its inverse. Finally, due to the continuity of the
distance function, for all $y_1, y_2 \in Y$, choosing sequences $\{a_m\}_{m \ge 1}$ and $\{b_m\}_{m \ge 1}$ in $X$ with
$\iota(a_m) \to y_1$ and $\iota(b_m) \to y_2$,

\[ d_Y(y_1, y_2) = \lim_{m \to \infty} d_Y(\iota(a_m), \iota(b_m)) = \lim_{m \to \infty} d_X(a_m, b_m) \]

while

\[ d_{Y'}(\rho(y_1), \rho(y_2)) = \lim_{m \to \infty} d_{Y'}((\rho \circ \iota)(a_m), (\rho \circ \iota)(b_m)) = \lim_{m \to \infty} d_{Y'}(\iota'(a_m), \iota'(b_m)) = \lim_{m \to \infty} d_X(a_m, b_m). \]

In short then, $d_Y(y_1, y_2) = d_{Y'}(\rho(y_1), \rho(y_2))$, namely $\rho$ is an isometry, and, as seen
above, a global one. □

The result above shows that all completions of a metric space $X$ are isometric and
thus that the completion is unique up to a global isometry. In light of the preceding
discussion, we cannot hope for anything more than that.

We now turn to the existence of a completion of an arbitrary metric space $X$.
Since a completion is only unique up to an isometry (so that if one non-empty
completion of $X$ exists, then infinitely many different completions of $X$ exist), it
should not be surprising that there are in the literature different constructions of
embeddings $\iota : X \to \hat{X}$, yielding different (yet isometric) completions. The one we
present is due to Cantor. The reader may wish to refer to the construction of the reals,
outlined in Sect. 1.1, since it is closely related to the general completion construction
below, only framed in a more familiar and less abstract setting.

Let us fix a metric space $X$ and construct a completion for it in a series of steps,
leaving it to the reader to provide the proofs. If $X$ is not complete, then there exists
a Cauchy sequence $\{x_m\}_{m \ge 1}$ which fails to have a limit in $X$. Intuitively, completing
$X$ entails adjoining to $X$ formal limits for all such Cauchy sequences. If we denote
the formal limit of $\{x_m\}_{m \ge 1}$ by $[\{x_m\}]$, then we are led to define $\hat{X}$ as the union

\[ X \cup \{ [\{x_m\}] \mid \{x_m\}_{m \ge 1} \text{ is a Cauchy sequence in } X \text{ which fails to converge} \}. \]

However, two different Cauchy sequences $\{x_m\}_{m \ge 1}$ and $\{y_m\}_{m \ge 1}$ may converge to
the same point (for instance, there are many different sequences of rationals that
converge to, e.g., $\sqrt{2}$), and thus an equivalence relation needs to be introduced.
There is also a simplifying observation, namely that we may identify an element
$x \in X$ with the constant sequence $\{x_m = x\}_{m \ge 1}$ (which is clearly Cauchy). Thus,
instead of the union of $X$ with the formal limits of limit-less Cauchy sequences, we
expect a completion of $X$ to have the underlying set

\[ \hat{X} = \{ [\{x_m\}] \mid \{x_m\}_{m \ge 1} \text{ is a Cauchy sequence in } X \} \]

where now $[\{x_m\}]$ denotes an equivalence class for a suitable equivalence relation
on the set of all Cauchy sequences in $X$. The following steps realize this idea.

1. Let $\tilde{X}$ be the set of all Cauchy sequences in $X$ and define $\tilde{d} : \tilde{X} \times \tilde{X} \to \mathbb{R}$ by
$\tilde{d}(\{x_m\}, \{y_m\}) = \lim_{m \to \infty} d(x_m, y_m)$.

2. Prove that the limit defining $\tilde{d}(\{x_m\}, \{y_m\})$ always exists. (Hint: The metric
function $d : X \times X \to \mathbb{R}_+$ is uniformly continuous, and $\mathbb{R}$ is complete.)

3. Prove that $(\tilde{X}, \tilde{d})$ is a semimetric space.

4. Recall Exercise 4.4 and let $\hat{X}$ be the metric space thus obtained from $\tilde{X}$, with
metric denoted by $\hat{d}$.

5. Prove that $(\hat{X}, \hat{d})$ is complete. (Hint: Avoid the temptation of taking the diagonal
of a Cauchy sequence of Cauchy sequences. Instead, use the Cauchy condition
whenever you can.)

6. Note that for all $x \in X$, the constant sequence $\{x_m = x\}_{m \ge 1}$ is Cauchy, thereby
defining a function $\iota : X \to \hat{X}$. Prove that it is an isometry.

7. Prove that $\iota(X)$ is dense in $\hat{X}$.
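
The idea behind the steps above can be explored on a computer for $X = \mathbb{Q}$. In the Python sketch below a Cauchy sequence of rationals is represented by a function returning its $m$-th term, and the limit of $d(x_m, y_m)$ is approximated by evaluating a single late term, which is a crude assumption of this sketch and not a substitute for the limit. The names `newton`, `bisection`, `constant`, and `approx_distance` are hypothetical helpers introduced for the illustration. Two different sequences approaching $\sqrt{2}$ turn out to be at (approximately) zero distance, so they represent the same point of the completion, whereas constant sequences reproduce the original distances in $\mathbb{Q}$.

```python
from fractions import Fraction


def newton(m):
    """m-th Newton iterate for sqrt(2): a Cauchy sequence of rationals."""
    x = Fraction(1)
    for _ in range(m - 1):
        x = (x + 2 / x) / 2
    return x


def bisection(m):
    """Midpoint after m bisection steps of [1, 2]: another such sequence."""
    lo, hi = Fraction(1), Fraction(2)
    for _ in range(m):
        mid = (lo + hi) / 2
        if mid * mid < 2:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def constant(q):
    """The constant sequence at q, i.e. the image of q under the embedding."""
    return lambda m: Fraction(q)


def approx_distance(x, y, m=12):
    """Approximates lim d(x_m, y_m) by the single late term |x_m - y_m|."""
    return abs(x(m) - y(m))


same_point = approx_distance(newton, bisection)            # nearly 0
different_points = approx_distance(newton, constant(1))    # about 0.414
embedded = approx_distance(constant(3), constant(Fraction(1, 2)))
```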

Remark 4.8 Any metric space $X$ has a completion, and thus, if non-empty, $X$ has
infinitely many different (yet isometric) completions. It is important to realize that
there is no distinguished completion that deserves to be called the completion of $X$.
Any particular construction of a completion for X can thus be seen as a model of the 
completion of X, whatever it may mean. Since isometric metric spaces are essentially 
the same it does not matter which model is used for the completion. However, some 
models may lend themselves to establish certain properties more easily than other 
models would. 

Exercises 

Exercise 4.31 For $k \ge 1$ and $I = [a, b]$ a non-degenerate closed interval, show that
the metric space $C^k(I, \mathbb{R})$ of functions with a continuous $k$-th derivative, with the
metric induced by the $L_\infty$ norm, is not complete.

Exercise 4.32 Is a subspace of a complete metric space necessarily complete? 

Exercise 4.33 Prove that a subset of a nowhere dense set is nowhere dense and that 
a finite union of nowhere dense sets is nowhere dense. 

Exercise 4.34 Prove that a subset of a meager set is meager and that a countable 
union of meager sets is meager. Is an arbitrary union of meager sets necessarily 
meager? 

Exercise 4.35 Is $\mathbb{R} - \mathbb{Q}$, the set of irrational numbers, a meager subset of $\mathbb{R}$ with
respect to the Euclidean metric?

Exercise 4.36 Consider the topological space $\mathbb{Q}$ as a subspace of $\mathbb{R}$ endowed with
the Euclidean topology. Prove that $\mathbb{Q}$ is metrizable but not by any complete metric
on $\mathbb{Q}$.

Exercise 4.37 Let $\{f_m : \mathbb{R} \to \mathbb{R}\}_{m \ge 1}$ be a countable family of continuous functions
(with respect to the Euclidean topology on $\mathbb{R}$). Suppose that for every point $x \in \mathbb{R}$
there exist distinct functions $f_k, f_m$ such that $f_k(x) = f_m(x)$. Prove that there exists
an interval $(a, b)$, with $a < b$, and distinct functions $f_k, f_m$ such that $f_k(x) = f_m(x)$
for all $x \in (a, b)$.

Exercise 4.38 Let $X$ and $Y$ be metric spaces and consider respective completions
$\iota_X : X \to \hat{X}$ and $\iota_Y : Y \to \hat{Y}$. Recall that the product of metric spaces can be
metrized by either the Euclidean product metric or the additive product metric. Check,
in each case, if the function $\iota : X \times Y \to \hat{X} \times \hat{Y}$, given by $\iota(x, y) = (\iota_X(x), \iota_Y(y))$,
is a completion of $X \times Y$.

Exercise 4.39 Let $V$ be an inner product space considered as a metric space and let
$\iota : V \to \hat{V}$ be a completion of it. Prove that the inner product on $V$ extends to an
inner product on $\hat{V}$.

Exercise 4.40 Let $V$ be a normed space with its induced metric structure, and let
$\iota : (V, d) \to (\hat{V}, \hat{d})$ be a completion of it. Prove that the norm $\| \cdot \| : V \to \mathbb{R}$
extends to a norm on $\hat{V}$ which induces the metric $\hat{d}$.


4.5 Compactness and Boundedness 

Compactness is a topological property and since, as we saw, any metric space is 
naturally endowed with a topology, compactness is meaningful in any metric space. 
The notion of completeness on the other hand is a metric property and does not 
have a topological counterpart. Completeness and compactness are not independent 
properties. The space R (with the Euclidean metric) is complete but not compact, 
its subset [0, 1] is both complete and compact, and (0, 1) is neither complete nor 
compact. The remaining combination, compact but not complete, can not occur, as 
we show below: if a metric space is compact, then it must be complete. In this 
section we show the remarkable fact that compactness in a metric space is equivalent 
to completeness together with a property known as total boundedness. The latter is 
a purely metric property and thus the result shows that the topological property of 
compactness is equivalent (for metric spaces, of course) to the conjunction of the 
purely metric conditions of completeness and total boundedness. 

The reader is familiar with the notion of bounded sets in the Euclidean space $\mathbb{R}^n$. 
In a general metric space though there are two notions of boundedness, given next. 
These two notions coincide in the Euclidean spaces. Recall that for a subset $S \subseteq X$ 
of a metric space X, its diameter is given by $\operatorname{diam}(S) = \sup_{x, y \in S} d(x, y)$, which 
may be $\infty$. 

Definition 4.17 A subset S in a metric space X is said to be bounded if its diameter 
is finite. A subset S is totally bounded if for every $\varepsilon > 0$ one can cover S by finitely 
many sets of diameter smaller than $\varepsilon$. 

The reader is invited to show that every totally bounded set is bounded, but that the 
converse need not hold. It is also evident that if $S' \subseteq S \subseteq X$ and S is bounded 
(respectively totally bounded), then S' is bounded (respectively totally bounded). 
Since X is a subset of itself, we may speak of the metric space X as being bounded 
or totally bounded. 
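
To illustrate why the converse can fail, the following minimal Python sketch looks at finite samples of a set carrying the discrete metric (distinct points at distance 1). The helper names `diameter`, `discrete`, and `min_cover_size_discrete` are illustrative choices and do not come from the text. Under this metric a set of diameter smaller than 1 contains at most one point, so the diameter stays at 1 while the number of small sets needed for a cover grows with the number of points. An infinite set with the discrete metric is therefore bounded but not totally bounded.

```python
from itertools import combinations


def diameter(points, d):
    """Largest pairwise distance in a finite set (0 for fewer than two points)."""
    return max((d(x, y) for x, y in combinations(points, 2)), default=0.0)


def discrete(x, y):
    """Discrete metric: distinct points are at distance 1."""
    return 0.0 if x == y else 1.0


def min_cover_size_discrete(points):
    """Number of sets of diameter < 1 needed to cover the points.

    Under the discrete metric such a set is empty or a singleton,
    so one set per distinct point is required.
    """
    return len(set(points))


for n in (5, 50, 500):
    pts = list(range(n))
    # the diameter does not grow with n, the cover size does
    print(n, diameter(pts, discrete), min_cover_size_discrete(pts))
```
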

The following result, whose proof is left for the reader, has an important corollary 
which is pivotal for the proof of the main theorem below. 

Proposition 4.3 The equality $\operatorname{diam}(\overline{S}) = \operatorname{diam}(S)$ holds for all subsets S of a metric 
space X. 

Corollary 4.9 Let X be a totally bounded metric space, $\mathcal{U} = \{U_i\}_{i \in I}$ an open 
covering of X, and $\varepsilon > 0$. If X can not be covered by finitely many elements from 
$\mathcal{U}$, then there exists a non-empty closed set $F \subseteq X$ with $\operatorname{diam}(F) < \varepsilon$ that also can 
not be covered by finitely many elements from $\mathcal{U}$. 

Proof Since X is totally bounded it can be covered by finitely many subsets of 
diameter smaller than $\varepsilon$. By the preceding proposition, we may take the closures 
of these sets without altering their diameters and thus we may assume they are all 
closed. If each of these sets was covered by finitely many elements from $\mathcal{U}$, then 
(by taking the union) X itself would be covered by finitely many elements from $\mathcal{U}$, 
contrary to the assumption on X. Thus at least one of these closed sets is the desired 
one. □ 

Theorem 4.12 For a metric space X the following conditions are equivalent. 

1. X is compact. 

2. X is sequentially compact, i.e., every sequence $\{x_m\}_{m \ge 1}$ admits a convergent 
subsequence $\{x_{m_k}\}_{k \ge 1}$. 

3. X is complete and totally bounded. 

Proof We prove first that if X is compact, then X is sequentially compact, and so 
let $\{x_m\}_{m \ge 1}$ be a sequence in X. If the set $S = \{x_m \mid m \in \mathbb{N}\}$ is finite, then at least one of 
its elements must repeat infinitely often in the given sequence. In other words, the 
sequence contains a constant subsequence, which certainly converges. Otherwise S 
is an infinite subset in a compact topological space and thus, by Theorem 3.17, it 
admits a cluster point $x_0 \in X$ and by Corollary 4.2 this cluster point is the limit of a 
sequence of elements from S. 

Next we show that if X is sequentially compact, then it is complete and totally 
bounded. Suppose thus that $\{x_m\}_{m \ge 1}$ is a Cauchy sequence in X. By sequential 
compactness we may extract a subsequence $\{x_{m_k}\}_{k \ge 1}$ converging to a point $x_0 \in X$. 
We leave it to the reader to show that $x_m \to x_0$ too, and thus every Cauchy sequence 
converges, so X is complete. 

To see that X is also totally bounded, assume to the contrary that it is not. Then 
there exists an $\varepsilon_0 > 0$ such that 

\[ X \neq \bigcup_{k=1}^{m} B_{\varepsilon_0}(x_k) \]

for any choice of points $x_1, \ldots, x_m \in X$, $m \ge 1$. We now construct the following 
sequence. Since $X \neq \emptyset$ (otherwise it is totally bounded), choose an arbitrary element 
$x_1 \in X$. Suppose that we had chosen points $x_1, \ldots, x_m \in X$ with the property that 
$d(x_i, x_j) \ge \varepsilon_0$ for all $1 \le i \neq j \le m$. We can now choose an arbitrary element 

\[ x_{m+1} \in X \setminus \bigcup_{k=1}^{m} B_{\varepsilon_0}(x_k) \]

and the condition $d(x_i, x_j) \ge \varepsilon_0$ now holds for all $1 \le i \neq j \le m + 1$. In this 
way we obtain the infinite sequence $\{x_m\}_{m \ge 1}$ which, by sequential compactness, has 
a convergent subsequence, which thus must be Cauchy. But by the construction of 
$\{x_m\}_{m \ge 1}$, no subsequence of it is Cauchy, and thus we obtain a contradiction. 
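
The greedy choice of points at mutual distance at least $\varepsilon_0$ can be sketched in Python on a finite sample of [0, 1]. This is only an illustration under the assumption that a grid of 1001 points stands in for the interval; the names `greedy_separated` and `euclid` are hypothetical. In a totally bounded space the selection has to stop after finitely many steps, and the chosen points then form centers of balls of radius $\varepsilon_0$ covering the sample, which is exactly what fails in the proof by contradiction above.

```python
def greedy_separated(points, d, eps):
    """Pick points one at a time, each at distance >= eps from all earlier picks."""
    chosen = []
    for p in points:
        if all(d(p, q) >= eps for q in chosen):
            chosen.append(p)
    return chosen


def euclid(x, y):
    """Euclidean metric on the real line."""
    return abs(x - y)


grid = [k / 1000 for k in range(1001)]  # finite sample of the interval [0, 1]

for eps in (0.5, 0.1, 0.01):
    net = greedy_separated(grid, euclid, eps)
    # a point that was not chosen lies within eps of some chosen point,
    # so the open balls of radius eps around the chosen points cover the sample
    assert all(any(euclid(p, q) < eps for q in net) for p in grid)
    print(eps, len(net))
```
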

Finally, we prove that if X is complete and totally bounded, then X is compact. 
Suppose that $\mathcal{U}$ is an open covering of X that does not admit a finite sub-covering. 
Our aim will be to construct a non-convergent Cauchy sequence and thus obtain a 
contradiction. 

Let us call a non-empty subset $F \subseteq X$ a problematic set if it is closed and can 
not be covered by finitely many elements from $\mathcal{U}$ (so X itself is problematic). By 
Corollary 4.9 (applied to X with $\varepsilon = 1$) there exists a problematic set $F_1$ with 
$\operatorname{diam}(F_1) < 1$. Applying the same corollary again (to $F_1$ and with $\varepsilon = 1/2$) there 
exists a problematic set $F_2 \subseteq F_1$ with $\operatorname{diam}(F_2) < 1/2$. Continuing inductively in 
this manner we obtain a sequence $F_1 \supseteq F_2 \supseteq F_3 \supseteq \cdots \supseteq F_m \supseteq \cdots$ of non-empty 
closed sets with $\operatorname{diam}(F_m) < 1/m$, all of which are problematic. Let $x_m \in F_m$ be 
an arbitrary element in $F_m$. The sequence $\{x_m\}_{m \ge 1}$ is clearly Cauchy since $x_k \in F_m$ 
for all $k \ge m$ and thus $d(x_k, x_{k'}) < 1/m$ for all $k, k' \ge m$. To see that the sequence 
can not converge assume that $x_k \to x_0$ for some $x_0 \in X$. Recalling that every closed 
set contains its limit points we see that $x_0 \in F_m$ for all $m \ge 1$ (since $x_k \in F_m$ 
for all $k \ge m$). Further, since $\mathcal{U}$ covers X there exists an element $U \in \mathcal{U}$ with 
$x_0 \in U$. Since U is open there exists a $\delta > 0$ with $B_\delta(x_0) \subseteq U$. Let m now be large 
enough so that $1/m < \delta$. We then have that $x_0 \in F_m$ and $\operatorname{diam}(F_m) < \delta$ and thus 
$F_m \subseteq B_\delta(x_0) \subseteq U$. We thus found an element in $\mathcal{U}$ which covers $F_m$, contrary 
to $F_m$ being a problematic set. It is thus shown that $\{x_m\}_{m \ge 1}$ is a non-convergent 
Cauchy sequence, contradicting the completeness of X. The proof is complete. □ 

Remark 4.9 We remark again that this result expresses the topological property of 
compactness purely in terms of metric notions. 

Exercises 

Exercise 4.41 Prove that a totally bounded subset S in a metric space X is bounded. 

Exercise 4.42 Let (X, d) be a metric space and let $d_t : X \times X \to \mathbb{R}_+$ be given by 
$d_t(x, y) = d(x, y)$ if $d(x, y) < 1$ and $d_t(x, y) = 1$ otherwise. Prove that $(X, d_t)$ is 
a metric space and that both $d_t$ and d induce the same topology on X. 

Exercise 4.43 Continuing the preceding exercise, prove that in $(X, d_t)$ every subset 
is bounded while the totally bounded subsets in (X, d) and $(X, d_t)$ coincide. 

Exercise 4.44 Prove that if $\{x_m\}_{m \ge 1}$ is a Cauchy sequence in a metric space X and 
$\{x_{m_k}\}_{k \ge 1}$ is a convergent subsequence converging to $x_0$, then $x_m \to x_0$. 

Exercise 4.45 Prove that a metric subspace S of $\mathbb{R}^n$ with the Euclidean metric is 
complete if, and only if, S is closed in $\mathbb{R}^n$. 

Exercise 4.46 In $\mathbb{R}^n$ with the Euclidean metric, prove that a subset is bounded if, 
and only if, it is totally bounded. Conclude that the compact subsets are precisely 
the closed and bounded ones. 

Exercise 4.47 Prove Proposition 4.3. 

Exercise 4.48 Prove that any subset of a totally bounded metric space X is totally 
bounded and that a closed subset of a complete metric space is complete. Deduce 
that a closed subset of a compact metric space is compact. 

Exercise 4.49 Prove that the unit ball $\{x \in \mathbb{R}^n \mid \|x\| \le 1\}$ is compact in $\mathbb{R}^n$ (here 
the norm and the topology are the Euclidean ones). 

Exercise 4.50 Prove that the unit ball $\{x \in c_0 \mid \|x\|_\infty \le 1\}$ is not compact in $c_0$ with 
the induced topology ($c_0$ is the space of sequences, say of real numbers, converging 
to 0 and the norm is the $\ell_\infty$ norm). 

Further Reading 

The adventurous reader is urged to delve into Fréchet’s original thesis ([3]) for a 
fascinating glimpse into the mind of the father of the subject matter of this chapter. 
Of the numerous introductory level textbooks on the subject, the reader may wish to 
consult [5] for an introduction to metric spaces which is very much in line with the 
presentation as given in this chapter, or the more advanced [6] for a more 
comprehensive introduction to analysis in general. The reader intrigued by Baire’s Theorem 
will find a historical account and numerous examples of its uses in [4]. 

For a glimpse of the overwhelming ubiquity of metric spaces in mathematics see 
the Encyclopedia of Distances ([1]). The encyclopedia presents an almost 
uncountable list of examples of metric spaces as well as various generalizations of metric 
spaces. Some of the generalizations (e.g., [8]) are inspired by problems in physics. 
Other generalizations (e.g., [2]) give rise to a unifying perspective on the relationship 
between metric spaces and topology (see [9]). For an investigation in the opposite 
direction, i.e., metric spaces with special properties, see [7]. 

Chapter 5 

Normed Spaces 


Abstract Normed spaces are treated at length as well as techniques of Banach 
spaces for solving differential and integral equations. Classical results, such as the 
Open Mapping Theorem, the Closed Graph Theorem, the Hahn-Banach Theorem, 
the Riesz Representation Theorem, and a few more, are given as well, establishing the 
core of the theory of bounded operators on Banach spaces. The chapter closes with 
a view towards generalizations of the theory in two directions, providing a glimpse 
of the theory of unbounded operators and of locally convex spaces. 

Keywords Normed space • Semi-normed space • Banach space • Fixed-point 
techniques • Dual space • Closed operator • Bounded operator • Linear functional • 
Fredholm equation • Hahn-Banach Theorem 

Normed spaces were introduced in Chap. 2. We revisit the definition here in a broader 
context and we study the topology and metric structure of normed spaces. We start 
though with the following discussion. 

From elementary geometry of the plane, namely Pythagoras’ Theorem, the length, 
or norm, of a vector $x = (x_1, x_2) \in \mathbb{R}^2$ is given by $\sqrt{x_1^2 + x_2^2}$. Consider now two 
vectors $x, y \in \mathbb{R}^2$. These vectors then determine a triangle with side lengths $\|x\|$, 
$\|y\|$, and $\|x - y\|$. Let $\theta$ be the internal triangle angle formed by x and y. From 
elementary trigonometry, namely the law of cosines (which is a generalization of 
Pythagoras’ Theorem), the lengths of the sides of the triangle and the angle $\theta$ are 
related by the formula 

\[ \|x - y\|^2 = \|x\|^2 + \|y\|^2 - 2\|x\|\|y\| \cos\theta. \]

Using the above expression for the length of a vector, this formula becomes 

\[ (x_1 - y_1)^2 + (x_2 - y_2)^2 = x_1^2 + x_2^2 + y_1^2 + y_2^2 - 2\|x\|\|y\| \cos\theta, \]

which simplifies to 

\[ x_1 y_1 + x_2 y_2 = \|x\|\|y\| \cos\theta. \]

Recalling the standard inner product in $\mathbb{R}^2$, given by $\langle x, y \rangle = x_1 y_1 + x_2 y_2$, we obtain 

\[ \langle x, y \rangle = \|x\|\|y\| \cos\theta \]

and thus we have discovered the standard inner product and the Cauchy-Schwarz 
Inequality in $\mathbb{R}^2$, by elementary geometry. 
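
As a quick numerical sanity check of this identity, the following Python sketch computes the angle between two sample vectors independently (through `math.atan2`) and compares both sides of the law of cosines and of the inner product formula. The vectors and the helper names `norm` and `inner` are chosen here purely for illustration.

```python
import math


def norm(v):
    """Euclidean length of a vector in the plane."""
    return math.sqrt(v[0] ** 2 + v[1] ** 2)


def inner(x, y):
    """Standard inner product in the plane."""
    return x[0] * y[0] + x[1] * y[1]


x, y = (3.0, 1.0), (1.0, 2.0)

# angle between x and y, obtained from the polar angles of the two vectors
theta = math.atan2(y[1], y[0]) - math.atan2(x[1], x[0])

# law of cosines for the triangle with sides x, y and x - y
diff = (x[0] - y[0], x[1] - y[1])
law_of_cosines = norm(x) ** 2 + norm(y) ** 2 - 2 * norm(x) * norm(y) * math.cos(theta)
assert math.isclose(norm(diff) ** 2, law_of_cosines)

# the inner product equals ||x|| ||y|| cos(theta)
assert math.isclose(inner(x, y), norm(x) * norm(y) * math.cos(theta))

# Cauchy-Schwarz follows since |cos(theta)| <= 1
assert abs(inner(x, y)) <= norm(x) * norm(y)
```
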

We thus see that with a sufficiently strong geometry (i.e., the presence of angles), 
a notion of a norm gives rise to an inner product. Conversely, an inner product gives 
rise to a norm by means of $\|x\| = \sqrt{\langle x, x \rangle}$, and, via the Cauchy-Schwarz Inequality, 
also gives rise to angles. Quite often though, the geometry present in a linear space 
of interest is not so rich that one can speak of angles, but a notion of a norm is still 
available. These are the normed spaces, which form the topic of this chapter. 

Section 5.1 presents the definition of semi-normed, normed, and Banach spaces, 
establishes several fundamental continuity results in general normed spaces, and 
establishes the Open Mapping Theorem, a result of great importance. Section 5.2 
then is an investigation of a powerful technique for solving a problem by recasting 
it in the form of a fixed-point problem, and then resorting to Banach’s Fixed-Point 
Theorem. This technique is exemplified in detail for the solution of certain classes 
of differential and integral equations, namely to Volterra equations and to Fredholm 
equations. Section 5.3 starts off with a general study of inverse operators, motivated 
by an analysis of fixed-point problems, and then proceeds to apply this more 
powerful technique to strengthen the results obtained in the previous section. Section 5.4 
investigates dual spaces, establishes the Riesz Representation Theorem, presents the 
duals of $\ell_p$ spaces, and the Hahn-Banach Theorem. The first four sections represent 
the core material of the theory of bounded linear operators between normed spaces. 
Section 5.5 is a short introduction to two fruitful directions in which the theory can 
be generalized, namely by considering unbounded operators between normed spaces 
and by introducing locally convex spaces. 


5.1 Semi-Norms, Norms, and Banach Spaces 

Banach spaces are normed spaces in which the algebra and geometry mesh very well 
together, giving rise to powerful theorems. We introduce Banach spaces in a broad 
context, starting with general semi-normed and normed spaces, a basic investigation 
of the topology of normed spaces, followed by a study of bounded linear operators 
and the Open Mapping Theorem. 


5.1.1 Semi-Norms and Norms 

The notion of a norm on a linear space is an association of lengths to vectors in such 
a way that the linear structure is taken into consideration. In practice, for various 
reasons, it is convenient to introduce a weaker structure, that of a semi-norm on a 
linear space. Every semi-normed space can canonically be turned into a normed one, 
and thus one may think of semi-norms as an intermediate step towards obtaining 
norms. 

Definition 5.1 Let V be a linear space over K = R or over K = C. A semi-norm 
on V is a function associating with every vector $x \in V$ a non-negative real number 
p(x), the norm of x, for which the following conditions hold. 

1. $p(\alpha x) = |\alpha| \, p(x)$, for all $\alpha \in K$ and $x \in V$. 

2. The triangle inequality holds, i.e., $p(x + y) \le p(x) + p(y)$, for all $x, y \in V$. 

If, moreover, p(x) = 0 implies x = 0, for all $x \in V$, then the function is called a 
norm on V, in which case we write $\|x\| = p(x)$. A linear space V together with a 
semi-norm on it is called a semi-normed space, and a linear space V together with a 
norm on it is called a normed space. It is common to refer to a (semi-)normed space 
V, leaving p(x), or $\|x\|$, implicit. 

Example 5.1 Numerous examples of normed spaces were already presented in 
Chap. 2, so we only consider here semi-norms. Firstly, and quite trivially, any linear 
space V supports the trivial semi-norm given by p(x) = 0 for all $x \in V$. Less 
trivially, consider the linear space $C(\mathbb{R}, \mathbb{R})$ of all continuous functions $x : \mathbb{R} \to \mathbb{R}$. 
The evaluation semi-norm is given by $\|x\| = |x(0)|$. Indeed 

\[ \|\alpha x\| = |\alpha x(0)| = |\alpha| |x(0)| = |\alpha| \|x\| \]

and 

\[ \|x + y\| = |x(0) + y(0)| \le |x(0)| + |y(0)| = \|x\| + \|y\|. \]

Note that it is obvious that $\|x\| = 0$ need not imply x = 0, only that x(0) = 0, so this 
semi-norm is not a norm. This construction is a typical one giving rise to semi-norms 
by ignoring some of the information embodied in the vectors (in this case, only x(0) 
is important for determining the semi-norm, all the other values of the function are 
ignored). Naturally, one may evaluate at any point, not just at 0, and obtain a 
family of semi-norms. 
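
The evaluation semi-norm is easy to experiment with. In the following Python sketch functions play the role of vectors, and `p`, `add`, and `scale` are illustrative helper names rather than notation from the text. The checks confirm the two semi-norm conditions on sample functions and exhibit a non-zero function, the sine, whose semi-norm is 0.

```python
import math


def p(x):
    """Evaluation semi-norm on functions R -> R: p(x) = |x(0)|."""
    return abs(x(0.0))


def add(x, y):
    """Pointwise sum of two functions."""
    return lambda t: x(t) + y(t)


def scale(a, x):
    """Pointwise scalar multiple of a function."""
    return lambda t: a * x(t)


f = math.cos                   # f(0) = 1
g = lambda t: t * t - 2.0      # g(0) = -2
h = math.sin                   # a non-zero function with h(0) = 0

assert math.isclose(p(scale(-3.0, f)), 3.0 * p(f))   # homogeneity
assert p(add(f, g)) <= p(f) + p(g)                   # triangle inequality
assert p(h) == 0.0 and h(1.0) != 0.0                 # p vanishes on a non-zero vector
```
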

We mention here that there is another natural way to obtain semi-norms. The 
space $C([a, b], \mathbb{R})$ is a space of continuous functions. If one wishes to consider 
non-continuous functions and to utilize the integral in order to obtain a norm similar in 
spirit to the $L_\infty$ or $L_p$ norms on $C([a, b], \mathbb{R})$, then one encounters the following 
difficulty. It is well-known that for a non-continuous function x, it is possible that 
$\int_a^b dt \, |x(t)| = 0$ while $x \neq 0$. For instance, if the function is constantly 0 except 
at finitely many points. Thus, by allowing non-continuous functions into our linear 
spaces, the insensitivity of the integral to finite changes in the integrand implies that 
norms on such spaces are rare. Semi-norms on such spaces though are available. 

We will also see below one more instance where semi-norms arise naturally from 
norms, namely when transferring a norm from a space to a quotient of it. 

We now describe the process of turning a semi-normed space into a normed one. 
This, of course, can not be achieved without paying a price. In this case all 
non-zero vectors of norm 0 must be quotiented out of existence. This is a small price to 
pay since vectors of norm 0 are, in a sense, negligible. As stated above, this is an 
important method for constructing normed spaces, since in practice it is common to 
have a natural choice of a semi-norm on a given space, and then one passes to the 
suitable quotient as we now describe. 

Theorem 5.1 Let V be a semi-normed space. 

1. The set $U = \{x \in V \mid p(x) = 0\}$ is a linear subspace of V. 

2. The function 

\[ \|x + U\| = p(x) \]

is well-defined on the quotient space V/U and with it V/U is a normed space. 

Proof 

1. In a semi-normed space we have $p(0) = p(0 \cdot 0) = |0| \cdot p(0) = 0$, and thus 
$0 \in U$, which is thus in particular non-empty. It remains to show that U is closed 
under addition and scalar products. Indeed, if $x, y \in U$, then 

\[ 0 \le p(x + y) \le p(x) + p(y) = 0 + 0 = 0, \]

and thus p(x + y) = 0, so that $x + y \in U$. Finally, if $x \in U$ and $\alpha \in K$ is an 
arbitrary scalar, then $p(\alpha x) = |\alpha| p(x) = |\alpha| \cdot 0 = 0$, so that $\alpha x \in U$. 

2. To show that $x + U \mapsto p(x)$ is well-defined, suppose that x' + U = x + U. Then 
$x - x' \in U$, namely p(x - x') = 0, and thus 

\[ \|x + U\| = p(x) = p((x - x') + x') \le p(x - x') + p(x') = p(x') = \|x' + U\|. \]

A similar argument shows that $\|x' + U\| \le \|x + U\|$, and it thus follows that 
$\|x + U\| = \|x' + U\|$ and thus that $x + U \mapsto p(x)$ is independent of the choice 
of representative. Next, clearly, $\|x + U\| \ge 0$ for all $x + U \in V/U$ and 

\[ \|\alpha(x + U)\| = \|\alpha x + U\| = p(\alpha x) = |\alpha| p(x) = |\alpha| \|x + U\| \]

for all $\alpha \in K$ and $x + U \in V/U$. A similar argument proves the triangle inequality. 
It remains to show that if $\|x + U\| = 0$, then x + U = 0 in the quotient space, 
namely that $x \in U$. Indeed, since $\|x + U\| = p(x)$, if $\|x + U\| = 0$, then $x \in U$ 
by definition of U. To conclude, p(x) induces a norm on the quotient space V/U, 
as claimed. □ 
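
A small concrete instance may help fix ideas. The Python sketch below takes V to be the plane with the semi-norm that only records the first coordinate, an example chosen here for illustration and not taken from the text. Then U is the vertical axis, two vectors lie in the same coset exactly when their first coordinates agree, and the value of p does not depend on the representative. The function names are hypothetical.

```python
def p(v):
    """Semi-norm on R^2 that only sees the first coordinate."""
    return abs(v[0])


def same_coset(v, w):
    """v + U equals w + U exactly when p(v - w) = 0."""
    return p((v[0] - w[0], v[1] - w[1])) == 0


def quotient_norm(v):
    """Norm of the coset v + U, computed from the representative v."""
    return p(v)


v, w = (2.0, 5.0), (2.0, -7.0)

# two representatives of the same coset give the same value
assert same_coset(v, w)
assert quotient_norm(v) == quotient_norm(w) == 2.0

# a coset of norm 0 is the zero coset U itself
z = (0.0, 3.0)
assert quotient_norm(z) == 0 and same_coset(z, (0.0, 0.0))
```
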

We already proved that a normed space is automatically endowed with a distance 
function. We now extend this simple, yet pivotal, observation to make a connection 
between semi-normed spaces and semimetric spaces as well. Recall that a semimetric 
space is a set together with a distance function $d : X \times X \to \mathbb{R}_+$ satisfying the axioms 
of a metric space, except that it is possible that d(x, y) = 0 for distinct points x and y. 
For the reader’s convenience we repeat the statement and the proof for normed spaces 
as well. 

Theorem 5.2 Let V be a semi-normed space. Then defining d(x, y) = p(x - y), 
for all vectors $x, y \in V$, endows V with the structure of a semimetric space. If V is a 
normed space, then $d(x, y) = p(x - y) = \|x - y\|$ endows V with the structure of a 
metric space. These are, respectively, the induced semimetric and metric structures. 

Proof Clearly $d(x, y) \ge 0$ and d(x, x) = p(0) = 0. Symmetry follows by 

\[ d(x, y) = p(x - y) = p((-1)(y - x)) = |-1| \, p(y - x) = d(y, x) \]

while the triangle inequality follows by 

\[ d(x, z) = p(x - z) = p(x - y + y - z) \le p(x - y) + p(y - z) = d(x, y) + d(y, z). \]

Finally, if the semi-norm is in fact a norm, then 

\[ d(x, y) = 0 \implies \|x - y\| = 0 \implies x - y = 0 \implies x = y, \]

completing the proof. □ 

Recall that there is a canonical construction for turning a semimetric space into 
a metric space. In more detail, if (X, d) is a semimetric space, then defining x ~ y 
precisely when d(x, y) = 0, is an equivalence relation on X and the function 
d([x], [y]) = d(x, y) is well-defined and turns the quotient set X/~ into a 
metric space. The next result shows that starting with a semi-normed space, turning it 
into a metric space by first constructing the associated semimetric and then turning 
it into a metric space, or first turning it into a normed space and then associating a 
metric space with it, yields the same end-result. 

Theorem 5.3 Let V be a semi-normed space. Consider the induced normed space 
$V/U = \{x + U \mid x \in V\}$, where $U = \{x \in V \mid p(x) = 0\}$, and the induced metric 
space (V/U, d). Consider next the induced semimetric space (V, d) and the induced 
metric space (V/~, d) where ~ is the equivalence relation on V given by x ~ y 
precisely when d(x, y) = 0. Then the two resulting metric spaces are identical. 

Proof First we show that the two resulting metric spaces have identical underlying 
sets. Indeed, for the second described metric space we consider the quotient set V/~ 
where x ~ y if, and only if, d(x, y) = 0. But d(x, y) = p(x - y) and thus x ~ y if, 
and only if, $x - y \in U$. It follows that [x] = x + U, and thus the quotient set V/~ 
coincides with the quotient set construction for V/U. Next we show that the metrics 
agree as well. The induced metric on V/U is given by 

\[ d(x + U, y + U) = \|x - y + U\| = p(x - y) \]

while the metric d on V/~ is given in terms of the semimetric d on V as follows: 

\[ d([x], [y]) = d(x, y) = p(x - y). \]

The two metrics are thus identical, as claimed. □ 

A subspace U of a semi-normed space (V, p) is automatically a semi-normed 
space by defining $y \mapsto p(y)$ for every $y \in U$, using the semi-norm in the ambient 
space. Similarly, any subspace U of a normed space $(V, \|\cdot\|)$ is a normed space with 
norm given by $y \mapsto \|y\|$, for all $y \in U$. The proofs are immediate. The situation 
for quotient spaces is slightly more complicated and we now address it. At this point 
topological considerations enter the discussion, and thus we recall that any metric 
induces a topology and thus the concepts of topology are available in any normed 
space. In particular, we may speak of closed subsets of a normed space V. 

Theorem 5.4 Let V be a normed space and U a linear subspace of V. Then the 
quotient space V/U, when endowed with the function 

\[ p(x + U) = \inf_{u \in U} \|x + u\|, \]

is a semi-normed space. If U is a closed subspace of V, then V/U is in fact a normed 
space. 

Proof The quotient vector space V/U is the set $\{x + U \mid x \in V\}$ of translations of 
U, with the linear operations determined by the representatives. We need to verify 
that the proposed formula 

\[ p(x + U) = \inf_{u \in U} \|x + u\| \]

satisfies the conditions for being a semi-norm. Note that p(x + U) is the infimum of 
the norms of the elements of the coset x + U, and thus it does not depend on the 
choice of the representative x. Clearly, $p(x + U) \ge 0$ since it is an 
infimum of non-negative real numbers. Given $\alpha \in K$ and $x + U \in V/U$ we need to 
show that 

\[ p(\alpha x + U) = |\alpha| \, p(x + U). \]

If $\alpha = 0$, then 

\[ p(\alpha x + U) = p(0 + U) = \inf_{u \in U} \|u\| = 0 \]

(since $0 \in U$ and $\|0\| = 0$), and the equality holds. If $\alpha \neq 0$, then 

\[ p(\alpha x + U) = \inf_{u \in U} \|\alpha x + u\| = \inf_{u \in U} \left\| \alpha x + \alpha \tfrac{1}{\alpha} u \right\| = |\alpha| \inf_{u \in U} \left\| x + \tfrac{1}{\alpha} u \right\| \]

and noting that $\{(1/\alpha) u\}_{u \in U}$ and $\{u\}_{u \in U}$ are the same set of vectors, the desired 
equality follows. To establish the triangle inequality we need to show that 

\[ \inf_{u \in U} \|x + y + u\| \le \inf_{u' \in U} \|x + u'\| + \inf_{u'' \in U} \|y + u''\| = \inf_{u', u'' \in U} \left( \|x + u'\| + \|y + u''\| \right). \]

It suffices to establish that, given $u', u'' \in U$, 

\[ \inf_{u \in U} \|x + y + u\| \le \|x + u'\| + \|y + u''\| \]

and indeed, 

\[ \inf_{u \in U} \|x + y + u\| \le \|x + y + (u' + u'')\| = \|(x + u') + (y + u'')\| \le \|x + u'\| + \|y + u''\|. \]

We thus far established that V/U, with the proposed function, is a semi-normed 
space. To complete the proof we now show that if U is closed in V, then V/U is in 
fact a normed space, namely that for $\|x + U\| = p(x + U)$, if $\|x + U\| = 0$, then 
x + U = 0 in the quotient space, that is that $x \in U$. So, noting another equivalent 
form for the proposed norm (obtained by replacing u with -u, which again ranges 
over all of U), suppose that 

\[ \|x + U\| = \inf_{u \in U} \|u - x\| = 0. \]

Now, by definition of infimum, there exists a sequence $u_n \in U$ with $\|u_n - x\| \to 0$. 
In other words, $d(u_n, x) \to 0$ and thus $u_n \to x$ in the induced topology. We thus 
exhibited x as a limit of a sequence in the closed set U, and thus x itself belongs to 
U, as required. □ 

To see that if U is not closed then the quotient space may indeed only be a 
semi-normed space, consider any normed space V and a dense linear subspace U of it. 
The quotient space V/U with the induced semi-norm 

\[ p(x + U) = \inf_{u \in U} \|x + u\| = \inf_{u \in U} \|u - x\| \]

allows for p(x + U) = 0 even if $x \notin U$. In fact, p(x + U) = 0 for all $x \in V$, in other 
words the induced semi-norm on V/U degenerates to the trivial semi-norm. Indeed, 
given $x \in V$, choose a sequence in U that converges to x, namely $\|u_n - x\| \to 0$. In 
particular then, 

\[ p(x + U) \le \inf_{n} \|u_n - x\| = 0. \]
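
For a closed subspace the infimum can be computed explicitly in simple cases. The Python sketch below assumes the Euclidean plane for V and the diagonal line spanned by (1, 1) for U, an example chosen here for illustration. The quotient norm of x + U is then the distance from x to the line, namely $|x_1 - x_2| / \sqrt{2}$, and the code compares this closed form with a brute-force minimization over a grid of elements of U. The names `quotient_seminorm` and `exact` are hypothetical, and the grid search is only an approximation of the infimum.

```python
import math


def norm(v):
    """Euclidean norm in the plane."""
    return math.hypot(v[0], v[1])


def quotient_seminorm(x, steps=20001, span=10.0):
    """Approximate the infimum of ||x + u|| over u = t * (1, 1), t in [-span, span]."""
    best = float("inf")
    for i in range(steps):
        t = -span + 2 * span * i / (steps - 1)
        best = min(best, norm((x[0] + t, x[1] + t)))
    return best


def exact(x):
    """Distance from x to the diagonal line, the closed form of the infimum."""
    return abs(x[0] - x[1]) / math.sqrt(2)


x = (3.0, 1.0)
same_coset_as_x = (8.0, 6.0)          # x + 5 * (1, 1)

assert abs(quotient_seminorm(x) - exact(x)) < 1e-3
# the value depends only on the coset, not on the representative
assert abs(quotient_seminorm(same_coset_as_x) - quotient_seminorm(x)) < 1e-3
# an element of U itself has quotient norm 0
assert quotient_seminorm((4.0, 4.0)) < 1e-3
```
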


5.1.2 Banach Spaces 

It can be shown that any finite-dimensional normed space is complete. However, as 
we have seen, an infinite-dimensional normed linear space need not be complete. 
Non-complete normed spaces exhibit very pathological behaviour. In some sense, 
such spaces are full of holes. In the rest of this chapter we will explore some of 
the consequences of the powerful interaction between algebra and topology when a 
normed space is also a Banach space, namely a complete metric space. 

Definition 5.2 A normed space which, with the metric induced by the norm, is 
complete is called a Banach space. Banach spaces are usually denoted by $\mathcal{B}$. An 
inner product space which, with the metric induced by the inner product, is complete 
is called a Hilbert space. Hilbert spaces are typically denoted by $\mathcal{H}$. 

Example 5.2 We have shown that $\mathbb{R}^n$ with the Euclidean metric is complete. Since 
the Euclidean metric is induced by the standard inner product on $\mathbb{R}^n$, it follows that 
$\mathbb{R}^n$ is a Hilbert space. For all $1 \le p \le \infty$ the normed space $\ell_p$ was shown to be 
complete with respect to the induced metric, and thus $\ell_p$ is a Banach space. The space 
$\ell_2$ is a Hilbert space since the $\ell_2$ norm is induced by an inner product. The space 
$C([a, b], \mathbb{R})$ with the $L_\infty$ norm is complete and is thus a Banach space. However, 
the pre-$L_p$ spaces $C([a, b], \mathbb{R})$ with the $L_p$ norm for $1 \le p < \infty$ are not Banach 
spaces since they are not complete. 

Since any metric space has a completion, it is natural to consider the completion 
of the non-complete spaces $C([a, b], \mathbb{R})$ (with various norms) in the hope that the 
algebraic structure of the space can be extended to the completion so as to obtain a 
Banach space. In fact, Exercises 4.39 and 4.40 already accomplish just that, a fact 
we record as the following result. 

Theorem 5.5 Let V be a linear space. 

1. Given an inner product on V, let $\hat{V}$ be a completion of V with respect to the metric 
induced by the norm (induced by the inner product). Then the inner product on 
V extends to an inner product on $\hat{V}$, making $\hat{V}$ a Hilbert space. 

2. Given a norm on V, let $\hat{V}$ be a completion of V with respect to the metric induced 
by the norm. Then the norm on V extends to a norm on $\hat{V}$, making $\hat{V}$ a Banach 
space. 

Definition 5.3 Let $1 \le p < \infty$. A completion of $C([a, b], \mathbb{R})$ with respect to the 
$L_p$ norm is denoted by $L_p([a, b], \mathbb{R})$. 

Corollary 5.1 For all $1 \le p < \infty$ the space $L_p([a, b], \mathbb{R})$ is a Banach space, and 
$L_2([a, b], \mathbb{R})$ is a Hilbert space. 

Remark 5.1 This family of Banach spaces is of great importance in applications of 
the general theory to problems in Quantum Mechanics. It should be noted that a more 
common route to the definition of the L p spaces is via measure theory and Lebesgue 
integrable functions. We took here a short-cut (of sorts) to the definition, utilizing 
the results on metric spaces established earlier. 

Recall that a series $\sum_{k=1}^{\infty} a_k$ of real or complex numbers is said to converge to s 
if the sequence $\{s_m\}_{m \ge 1}$ of partial sums, i.e., $s_m = \sum_{k=1}^{m} a_k$, converges to s. We 
thus see that convergence for series is defined in terms of sequences. Moreover, 
one can recover the convergence of sequences in terms of that of series, as follows. 
Given a sequence $\{x_m\}_{m \ge 1}$ consider the series $\sum_{k=1}^{\infty} a_k$ where $a_k = x_k - x_{k-1}$ (and 
$a_1 = x_1$). The partial sum $s_m$ is easily seen to be nothing but $x_m$, and thus if the 
series converges, then so does the sequence. Some aspects of the interplay between 
sequences and series in R and in C extend to normed spaces, as we now briefly 
discuss. 
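
The passage from a sequence to a series and back is a telescoping sum, which the following short Python sketch makes concrete on a finite initial segment. The helper name `differences` is illustrative, and integer terms are used so that the comparison is exact.

```python
from itertools import accumulate


def differences(xs):
    """Terms of the associated series: a_1 = x_1 and a_k = x_k - x_{k-1} for k >= 2."""
    return [xs[0]] + [xs[k] - xs[k - 1] for k in range(1, len(xs))]


xs = [1, 4, 9, 16, 25]               # initial segment of a sequence
a = differences(xs)                  # terms of the associated series
partial_sums = list(accumulate(a))   # s_m = a_1 + ... + a_m

# the partial sums telescope back to the original sequence
assert partial_sums == xs
```
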

Definition 5.4 For elements $\{a_m\}_{m \ge 1}$ in a normed space V, we say that the series 
$\sum_{k=1}^{\infty} a_k$ converges to $s \in V$ if the sequence of partial sums converges to s with 
respect to the norm in V, i.e., when 

\[ \lim_{m \to \infty} \left\| \left( \sum_{k=1}^{m} a_k \right) - s \right\| = 0. \]

We say that a series 

\[ \sum_{k=1}^{\infty} a_k \]

converges absolutely if the series of real numbers 

\[ \sum_{k=1}^{\infty} \|a_k\| \]

converges. 

Recall that if a series $\sum_{k=1}^{\infty} a_k$ of real (or complex) numbers converges absolutely, 
that is if $\sum_{k=1}^{\infty} |a_k|$ converges, then the original series converges as well. Viewing R 
(or C) as a normed space, this result is a special case of the following useful criterion 
for a normed space to be a Banach space. 

Theorem 5.6 A normed space V is a Banach space if, and only if, the absolute 
convergence of a series in V implies its convergence. 

The proof is left for the reader as we now turn to show that the algebraic structure in 
a normed space interacts well with the metric and topological structures induced by 
the norm. 

Proposition 5.1 The following assertions hold for any normed space V. 

1. The mapping $x \mapsto \|x\|$, as a function $V \to \mathbb{R}$, is non-expanding and thus the 
norm function is (uniformly) continuous. 

2. For any fixed $x_0 \in V$, the translation mapping $x \mapsto x_0 + x$ is a global isometry 
and a homeomorphism. 

3. For any $\alpha \in K$ with $\alpha \neq 0$, the scaling mapping $x \mapsto \alpha x$, as a function $V \to V$, 
is uniformly continuous and a homeomorphism. 

4. The mapping $(x, y) \mapsto x + y$, as a function $V \times V \to V$, is continuous (and in 
fact uniformly continuous when $V \times V$ is given the additive product metric). 


Proof 

1. Referring to Proposition 2.12 we have $\left| \|x\| - \|x'\| \right| \le \|x - x'\|$, which is 
precisely the statement that the norm function is non-expanding. 

2. Since $\|(x_0 + x) - (x_0 + x')\| = \|x - x'\|$ any translation is an isometry. Further, 
the translation $x \mapsto x_0 + x$ is clearly invertible, its inverse being the translation 
$x \mapsto -x_0 + x$. An isometry is certainly continuous, and thus any translation is a 
continuous function with continuous inverse, and thus a homeomorphism. 

3. Since $\|\alpha x - \alpha x'\| = |\alpha| \|x - x'\|$, it follows that any scaling mapping is uniformly 
continuous. The scaling mapping $x \mapsto \alpha x$ is clearly invertible, with inverse the 
scaling mapping $x \mapsto (1/\alpha) x$. Any non-zero scaling is thus a continuous function 
with a continuous inverse, and thus a homeomorphism. 

4. Left for the reader. □ 


5.1.3 Bounded Operators 

In the text below, linear operators will typically be denoted by A or B, and their
values on vectors x will be given by, e.g., Ax rather than the more cumbersome
A(x).

The norm $\|x\|$ in a normed space represents the length of the vector x. Given an
operator A between normed spaces it is thus natural to study the way the operator
affects the norm, that is, how $\|x\|$ and $\|Ax\|$ are related. If the operator does not
increase the norm of any vector by more than some fixed constant factor, then it is said to be
bounded. This class of operators is studied here, showing that for linear operators
the concepts of boundedness and continuity coincide. The main result we establish
is that the collection of bounded linear operators between normed spaces is itself a
normed space, which in fact is a Banach space if the codomain is a Banach space.

Definition 5.5 An operator $A : V \to W$ between normed spaces is called a bounded
operator if there exists a positive real number M such that

\[ \|Ax\| \le M \|x\| \]

for all $x \in V$.

Equivalently, a linear operator A is bounded if it maps bounded sets in the domain to
bounded sets in the codomain (where a set S of vectors in a normed space is bounded
if there exists a positive M such that $\|s\| \le M$ for all $s \in S$).

Remark 5.2 It is not hard to show that any linear operator between finite-dimensional 
normed spaces is bounded. 

Example 5.3 In $C([a, b], \mathbb{R})$, the linear space of continuous functions $x : [a, b] \to \mathbb{R}$
with the $L_\infty$ norm, consider the integral operator:

\[ (Ax)(s) = \int_a^b dt \, K(t, s) \, x(t) \]

with $K : [a, b] \times [a, b] \to \mathbb{R}$ a given continuous function. To see that the operator
A is bounded note that if $M = \max_{t, s \in [a, b]} |K(t, s)|$, then

\[ \|Ax\| = \max_{s \in [a, b]} \Big| \int_a^b dt \, K(t, s) \, x(t) \Big|
   \le \max_{t, s \in [a, b]} |K(t, s)| \int_a^b dt \, |x(t)|
   \le M (b - a) \max_{t \in [a, b]} |x(t)| = M (b - a) \|x\|. \]


The following example shows that a very familiar linear operator is not bounded. 

Example 5.4 Consider the operator $A : C^1([0, 1], \mathbb{R}) \to C([0, 1], \mathbb{R})$, where both
the domain and the codomain are given the $L_\infty$ norm, which maps any continuously
differentiable function $x : [0, 1] \to \mathbb{R}$ to its derivative, i.e., $Ax = dx/dt$. The fact
that A is not bounded is seen for example by considering $x_n(t) = t^n$, and noting that
$\|x_n\| = 1$ but $Ax_n(t) = n t^{n-1}$, and thus $\|Ax_n\| = n$.
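The growth of the ratio $\|Ax_n\| / \|x_n\|$ can be observed numerically. The following is a minimal sketch, assuming that sampling on a uniform grid is an acceptable stand-in for the true maximum; the helper names `sup_norm` and `norm_ratio` are illustrative and not part of the text.

```python
def sup_norm(f, a=0.0, b=1.0, samples=1001):
    """Approximate max |f(t)| over [a, b] on a uniform grid (endpoints included)."""
    return max(abs(f(a + (b - a) * i / (samples - 1))) for i in range(samples))


def norm_ratio(n):
    """Return ||A x_n|| / ||x_n|| for x_n(t) = t**n and A = d/dt on [0, 1]."""
    x_n = lambda t: t ** n
    dx_n = lambda t: n * t ** (n - 1)  # derivative of t**n, computed by hand
    return sup_norm(dx_n) / sup_norm(x_n)


for n in (1, 5, 25, 125):
    # the ratio grows with n, so no single constant M can bound it
    print(n, norm_ratio(n))
```

Both maxima are attained at $t = 1$, which belongs to the grid, so in this particular case the sampled values agree with the exact norms.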

Remark 5.3 The failure of the differentiation operator to be bounded is, in a sense, a 
reflection (or the cause, depending on one’s point-of-view) of the complicated nature 
of its inverse operator, namely the non-trivial nature of integration. 

As mentioned above, it is the case for linear operators that being continuous and 
being bounded are equivalent conditions. In fact, we now show that the interplay 
between the linear structure and the topology in a normed space implies a stronger 
unification of concepts. 

Theorem 5.7 The following conditions for a linear operator $A : V \to W$ between
normed spaces are equivalent.

1. A is uniformly continuous. 

2. A is continuous. 

3. A is continuous at x = 0. 

4. A is bounded. 


Proof Obviously, uniform continuity implies continuity, and continuity implies
continuity at $x = 0$. Suppose now that A is continuous at $x = 0$. Then there exists a
$\delta > 0$ such that $\|Ax\| < 1$ for all $x \in V$ with $\|x\| < 2\delta$. Now, given an arbitrary
$y \in V$, if $y = 0$, then $Ay = 0$, while if $y \neq 0$, then, noting that $\| \delta y / \|y\| \| = \delta < 2\delta$,
one obtains that

\[ \|Ay\| = \Big\| A \Big( \frac{\|y\|}{\delta} \cdot \frac{\delta y}{\|y\|} \Big) \Big\|
   = \frac{\|y\|}{\delta} \Big\| A \Big( \frac{\delta y}{\|y\|} \Big) \Big\| < \frac{1}{\delta} \|y\|, \]

showing that A is bounded by $1/\delta$. Finally, if A is bounded by some $M > 0$, then
$\|Ax - Ax'\| = \|A(x - x')\| \le M \|x - x'\|$, for all $x, x' \in V$, and thus uniform
continuity follows easily. □

For a bounded operator A, the inequality $\|Ax\| \le M \|x\|$ sets an upper bound on
how much A can lengthen vectors in the domain. It is natural to consider the infimum
over all such upper bounds M and define that to be the norm of the operator A.

Definition 5.6 Let $A : V \to W$ be a linear operator. The operator norm or simply
the norm of A is defined to be

\[ \|A\| = \inf \{ M > 0 \mid \|Ax\| \le M \|x\| \text{ for all } x \in V \}, \]

adopting the convention that $\inf \emptyset = \infty$.

Thus, a linear operator is bounded if, and only if, its norm is finite.

The homogeneity of the norm with respect to scalar multiplication, and continuity
considerations, give rise to the following ways of computing the operator norm.

Proposition 5.2 For a bounded linear operator $A : V \to W$ between normed
spaces, the following expressions are all equal to $\|A\|$.

1. $\sup \{ \|Ax\| \mid x \in V, \ \|x\| \le 1 \}$.

2. $\sup \{ \|Ax\| \mid x \in V, \ \|x\| < 1 \}$.

3. $\sup \{ \|Ax\| \mid x \in V, \ \|x\| = 1 \}$.

4. $\sup \{ \|Ax\| / \|x\| \mid x \in V, \ x \neq 0 \}$.

The proof is left for the reader. 
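As an illustration of the third and fourth expressions, the following minimal sketch computes the operator norm of a small matrix acting on $\mathbb{R}^2$ with the max norm. The matrix and the helper names are assumptions made for this example only. It relies on the fact that $x \mapsto \|Ax\|$ is convex, so its maximum over the cube $\|x\| \le 1$ is attained at a vertex, that is, at a vector of signs.

```python
import itertools
import random


def max_norm(v):
    return max(abs(c) for c in v)


def apply(matrix, v):
    return [sum(a * c for a, c in zip(row, v)) for row in matrix]


def op_norm_max(matrix):
    """sup of ||Ax|| over ||x|| = 1 for the max norm, found on the cube's vertices."""
    n = len(matrix[0])
    return max(max_norm(apply(matrix, signs))
               for signs in itertools.product((-1.0, 1.0), repeat=n))


A = [[1.0, -2.0], [0.5, 3.0]]  # illustrative matrix, not taken from the text
norm_A = op_norm_max(A)

# Expression 4: the quotients ||Ax|| / ||x|| never exceed the norm.
random.seed(0)
ratios = []
for _ in range(1000):
    x = [random.uniform(-5, 5) for _ in range(2)]
    if max_norm(x) > 0:
        ratios.append(max_norm(apply(A, x)) / max_norm(x))

print(norm_A, max(ratios))
```

The random quotients approach the value found on the unit sphere from below, in line with the equality of the two suprema.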

In general, any bounded operator, simply since it is continuous, is determined by
its values on a dense subset. However, not every bounded operator defined on a dense
subspace can be extended to the full domain. The next result shows that when the codomain
is a Banach space this difficulty disappears.

Theorem 5.8 Let V be a normed space, $\mathcal{U} \subseteq V$ a dense linear subspace of V,
and $\mathcal{B}$ a Banach space. If $A : \mathcal{U} \to \mathcal{B}$ is a bounded linear operator, then there
exists a unique linear operator $B : V \to \mathcal{B}$ such that $Bx = Ax$ for all $x \in \mathcal{U}$ and
$\|B\| = \|A\|$. Such a B is called an extension of A.

Proof Given $x_0 \in V$ let $\{x_m\}_{m \ge 1}$ be a sequence in $\mathcal{U}$ with $x_m \to x_0$, and thus in
particular, $\{x_m\}_{m \ge 1}$ is Cauchy. Recall that a bounded linear operator is in particular
a uniformly continuous function, and that uniformly continuous functions preserve
Cauchy sequences. Hence A preserves Cauchy sequences, and thus the sequence
$\{Ax_m\}_{m \ge 1}$ in $\mathcal{B}$ is Cauchy. Since $\mathcal{B}$ is complete there exists an element $y_0 \in \mathcal{B}$ with
$Ax_m \to y_0$. Suppose now that $\{x'_m\}_{m \ge 1}$ is another sequence in $\mathcal{U}$ with $x'_m \to x_0$,
and suppose $Ax'_m \to y'_0$. Utilizing the continuity of the norm, it follows that

\[ \|y_0 - y'_0\| = \big\| \lim_{m \to \infty} (Ax_m - Ax'_m) \big\| = \lim_{m \to \infty} \|Ax_m - Ax'_m\|
   \le \|A\| \, \big\| \lim_{m \to \infty} (x_m - x'_m) \big\| = \|A\| \, \|x_0 - x_0\| = 0 \]

and thus $y_0 = y'_0$. Setting $Bx_0 = y_0$ is thus well-defined and we obtain a function
$B : V \to \mathcal{B}$. Obviously, for $x \in \mathcal{U}$ we may choose $x_m = x$ and then

\[ Bx = \lim_{m \to \infty} Ax_m = Ax \]

showing that B extends A.

That B is linear is seen as follows. If $x, y \in V$ and $x_m \to x$ and $y_m \to y$ as
above, then it follows that $x_m + y_m \to x + y$ and thus

\[ B(x + y) = \lim_{m \to \infty} A(x_m + y_m) = \lim_{m \to \infty} Ax_m + \lim_{m \to \infty} Ay_m = Bx + By. \]

A similar argument shows that $B(\alpha x) = \alpha (Bx)$. As for showing that $\|A\| = \|B\|$,
note first that $\|A\| \le \|B\|$ just by the fact that B is an extension of A. Next, for $x_0 \in V$
and $x_m \to x_0$ as above, one has

\[ \|Ax_m\| \le \|A\| \, \|x_m\| \]

and passing to the limit, and remembering that the norm is continuous, we obtain

\[ \|Bx_0\| \le \|A\| \, \|x_0\| \]

showing that $\|B\| \le \|A\|$, and thus the desired equality holds. Finally, if C is any
bounded linear extension of A, then it is continuous and agrees with A on a dense
subset. But any two continuous mappings that agree on a dense subset are equal,
hence $B = C$, showing that B is the unique linear bounded extension of A to all
of V. □


5.1.4 The Open Mapping Theorem 

A continuous function between topological spaces need not be open (i.e., it need 
not send open subsets to open subsets). We now show that surjectivity is a sufficient 
condition for a bounded linear operator between Banach spaces to be open. The 
result is due to Stefan Banach and is our first example of a deep result in the theory 
of Banach spaces, one with profound consequences. 

The technical heart of the argument is stated as a separate lemma. Its proof is
omitted in favour of a more conceptual presentation of the main result. We refer
the reader to any text dedicated to Banach spaces in order to fill in the details and
obtain a fully rigorous proof.

It will be convenient to introduce the notation $O_V(s) = \{ x \in V \mid \|x\| < s \}$, the
open ball in a normed space V with centre the zero vector 0 and radius $s > 0$. We
further write $n \cdot O_V(s)$ for the point-wise scaling of $O_V(s)$ by n, namely

\[ n \cdot O_V(s) = \{ nx \mid x \in O_V(s) \} = O_V(ns). \]

Lemma 5.1 Let $A : \mathcal{B} \to V$ be a bounded linear operator from a Banach space
$\mathcal{B}$ to a normed space V. It then follows that if $r, s > 0$ are such that

\[ O_V(s) \subseteq \overline{A(O_{\mathcal{B}}(r))}, \]

then

\[ O_V(s) \subseteq A(O_{\mathcal{B}}(2r)). \]

Theorem 5.9 (Open Mapping Theorem) A bounded surjective linear operator
$A : \mathcal{B}_1 \to \mathcal{B}_2$ between Banach spaces is an open mapping.

Proof Let $U \subseteq \mathcal{B}_1$ be an open set, which we may assume is non-empty. For an
arbitrary $x_0 \in U$, let us consider $Ax_0$, an arbitrary element in $A(U)$. As U is open, let $\varepsilon > 0$
be such that $B_\varepsilon(x_0) \subseteq U$, and notice that $B_\varepsilon(x_0) = O_{\mathcal{B}_1}(\varepsilon) + x_0$, i.e., a ball with
centre $x_0$ is the $x_0$ translation of a ball centred at the origin. By linearity of A we
then have

\[ A(U) \supseteq A(x_0 + O_{\mathcal{B}_1}(\varepsilon)) = Ax_0 + A(O_{\mathcal{B}_1}(\varepsilon)). \]

Thus the proof is reduced to showing that for every $\varepsilon > 0$ there exists a $\delta > 0$ such
that $O_{\mathcal{B}_2}(\delta) \subseteq A(O_{\mathcal{B}_1}(\varepsilon))$, since then

\[ A(U) \supseteq Ax_0 + O_{\mathcal{B}_2}(\delta) = B_\delta(Ax_0) \]

and thus $Ax_0$ is an interior point of $A(U)$. As $Ax_0$ was arbitrary, every point of $A(U)$
is interior, and thus $A(U)$ is open.

We thus fix an open ball $O_{\mathcal{B}_1}(\varepsilon)$, $\varepsilon > 0$, denote $X = A(O_{\mathcal{B}_1}(\varepsilon))$, and we seek out
a $\delta > 0$ for which $O_{\mathcal{B}_2}(\delta) \subseteq X$. Noting that

\[ \mathcal{B}_1 = \bigcup_{n \ge 1} n \cdot O_{\mathcal{B}_1}(\varepsilon), \]

the surjectivity and homogeneity of A imply that

\[ \mathcal{B}_2 = A(\mathcal{B}_1) = \bigcup_{n \ge 1} n \cdot X. \]

Taking closures and applying Baire's Theorem, there exists an $n \ge 1$ such that $\overline{n \cdot X}$
contains an open ball. In other words, there exist $y \in \mathcal{B}_2$ and $\delta > 0$ such that $B_\delta(y) \subseteq \overline{n \cdot X}$.
We omit some details on translations and scaling which are used to conclude that in
fact from $B_\delta(y) \subseteq \overline{n \cdot X}$ it follows that $O_{\mathcal{B}_2}(\delta / n) \subseteq \overline{X}$. Lemma 5.1 then concludes
the proof. □

The next result is an immediate corollary of the Open Mapping Theorem, illustrating
some of the taming effect of the completeness requirement on Banach spaces.

Corollary 5.2 If a bounded linear operator $A : \mathcal{B}_1 \to \mathcal{B}_2$ between Banach spaces
is invertible, then its inverse $A^{-1}$ is also a bounded linear operator.

Proof By Proposition 2.8, the inverse of a linear operator is itself linear. It remains
to show that $A^{-1}$ is continuous, namely that $(A^{-1})^{-1}(U)$ is an open set whenever
U is open. Since $(A^{-1})^{-1} = A$, we need to show that $A(U)$ is open whenever U is
open, namely that A is an open mapping. Indeed, since A is invertible it is necessarily
surjective, and thus A is an open mapping by the Open Mapping Theorem. □


5.1.5 Banach Spaces of Linear and Bounded Operators 

We end this section by studying the set $B(V, W)$ of all bounded linear operators
$A : V \to W$ between normed spaces. The main result is that when endowed with the
operator norm one obtains a normed space, which, furthermore, is a Banach space
provided the codomain is a Banach space.

Theorem 5.10 Let $A, B : V \to W$ be bounded linear operators between normed
spaces. Then

1. $\|A\| = 0$ if and only if $A = 0$.

2. $\|\alpha A\| = |\alpha| \, \|A\|$ for all scalars $\alpha$.

3. $\|A + B\| \le \|A\| + \|B\|$.

Proof

1. If $A = 0$ then $\|Ax\| \le 0 \cdot \|x\|$ and thus $\|A\| = 0$. Conversely, if $\|A\| = 0$, then
$\|Ax\| \le \|A\| \, \|x\| = 0$ and thus $Ax = 0$, for all $x \in V$.

2. $\|\alpha A\| = \sup_{\|x\| = 1} \|\alpha Ax\| = |\alpha| \sup_{\|x\| = 1} \|Ax\| = |\alpha| \, \|A\|$.

3. $\|A + B\| = \sup_{\|x\| = 1} \|(A + B)x\| \le \sup_{\|x\| = 1} (\|Ax\| + \|Bx\|) \le \|A\| + \|B\|$.

□

Recall Theorem 2.3, stating that for linear spaces V and W the set $\mathrm{Hom}(V, W)$ of
all linear operators $A : V \to W$ is a linear space.

Theorem 5.11 For normed spaces V and W, the set $B(V, W)$ is a linear subspace
of $\mathrm{Hom}(V, W)$ which, when endowed with the operator norm, is a normed space.

Proof Theorem 5.10 shows that $B(V, W)$ contains 0 and is closed under addition
and scalar multiplication, and is thus a linear subspace of $\mathrm{Hom}(V, W)$. Further, the
same theorem establishes the axioms of a normed space for the operator norm on
$B(V, W)$, and the claim follows. □

Since $B(V, W)$ is a normed space, it is also a metric space and thus a corresponding
notion of convergence for operators is automatic.

Definition 5.7 (Operator Norm Convergence or Strong Convergence) Let V and
W be normed spaces. Given a sequence $\{A_m\}_{m \ge 1}$ of operators in $B(V, W)$, we say
that $\{A_m\}_{m \ge 1}$ converges strongly or converges in the operator norm to an operator
$A_0 \in B(V, W)$ if $\|A_m - A_0\| \to 0$ (equivalently, when $A_m \to A_0$ with respect to
the metric induced by the operator norm).

A weaker notion of convergence is also available in the space $B(V, W)$.

Definition 5.8 Let V and W be normed spaces. We say that a sequence $\{A_m\}_{m \ge 1}$ of
operators in the space $B(V, W)$ converges point-wise to an operator $A_0 \in B(V, W)$
if $A_m x \to A_0 x$ for all $x \in V$.

Since $\|A_m x - A_0 x\| \le \|A_m - A_0\| \, \|x\|$ it follows that strong convergence implies
point-wise convergence. That the converse may fail is illustrated next.

Example 5.5 Consider the sequence of linear operators $F_m : \ell_2 \to \mathbb{C}$ (linear
operators whose codomain is the ground field are called linear functionals and are
studied in detail below), given by $F_m x = x_m$, that is $F_m$ is the projection onto the m-th
coordinate, and thus is clearly a linear operator. Since

\[ \|F_m x\| = |x_m| \le \|x\| \]

we see that $\|F_m\| \le 1$ and thus $F_m$ is bounded (and in fact $\|F_m\| = 1$, since there
clearly exist $x \in \ell_2$ with $\|F_m x\| = \|x\|$).

We claim that $F_m \to 0$ point-wise, i.e., that $F_m x \to 0$ for all $x \in \ell_2$. Indeed, for
$x \in \ell_2$ one has

\[ \sum_{m=1}^{\infty} |x_m|^2 < \infty \]

and consequently, $x_m \to 0$. But $x_m = F_m x$, and the claim follows. However, as
noted, $\|F_m\| = 1$ and thus $\{F_m\}_{m \ge 1}$ does not converge strongly to 0, or to any other
vector.
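The two behaviours can be seen side by side in a small computation. This is a minimal sketch in which a sequence is modelled as a Python function from indices to values; the names `F`, `harmonic` and `unit_vector` are illustrative assumptions, and the sequence $x_k = 1/k$ is just one convenient element of $\ell_2$.

```python
def F(m, x):
    """The functional F_m: pick out the m-th coordinate of the sequence x."""
    return x(m)


def harmonic(k):
    """x_k = 1/k, a square-summable sequence, so x lies in l_2."""
    return 1.0 / k


def unit_vector(m):
    """e_m: the sequence with 1 in position m and 0 elsewhere, of norm 1."""
    return lambda k: 1.0 if k == m else 0.0


for m in (1, 10, 100, 1000):
    # F_m x shrinks for the fixed x, yet F_m e_m stays 1, so ||F_m|| >= 1
    print(m, F(m, harmonic), F(m, unit_vector(m)))
```

For each fixed x the values tend to 0, while the vector on which $F_m$ attains its norm moves with m, which is why no uniform estimate is available.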

Finally, we prove the completeness of $B(V, \mathcal{B})$ when the codomain is complete.

Theorem 5.12 Let $\mathcal{B}$ be a Banach space and V an arbitrary normed space. Then
the linear space $B(V, \mathcal{B})$, endowed with the operator norm, is a Banach space.

Proof We already know that with the operator norm $B(V, \mathcal{B})$ is a normed space,
so it remains to show that it is complete. Consider a Cauchy sequence $\{A_m\}_{m \ge 1}$.
Since $\|A_k x - A_m x\| \le \|A_k - A_m\| \, \|x\| \to 0$ for all $x \in V$, it follows at once that
$\{A_m x\}_{m \ge 1}$ is also a Cauchy sequence, and, as $\mathcal{B}$ is complete, $A_m x \to y$ for some $y \in \mathcal{B}$. We thus
define $A_0(x) = y$ and obtain the function $A_0 : V \to \mathcal{B}$ which we proceed to show
is a bounded linear operator.

The linearity of $A_0$ follows by standard limit arguments. For instance, continuity
of vector addition implies that

\[ A_0(x + x') = \lim_{m \to \infty} A_m(x + x') = \lim_{m \to \infty} A_m x + \lim_{m \to \infty} A_m x' = A_0 x + A_0 x', \]

and thus $A_0$ is additive. A similar argument shows $A_0$ is homogeneous. To show that
$A_0$ is bounded, we use the continuity of the norm to obtain that

\[ \|A_0 x\| = \big\| \lim_{m \to \infty} A_m x \big\| = \lim_{m \to \infty} \|A_m x\| \le \lim_{m \to \infty} \|A_m\| \, \|x\| \le M \|x\| \]

where M is any upper bound of the sequence $\{\|A_m\|\}_{m \ge 1}$ (which exists since any Cauchy
sequence is bounded).

Lastly, we verify that $A_m \to A_0$ in the operator norm. Given $\varepsilon > 0$, using the
Cauchy condition, there exists an $N \in \mathbb{N}$ such that $\|A_m - A_k\| < \varepsilon$ for all $m, k > N$.
In particular, for all $x \in V$,

\[ \|(A_m - A_k) x\| \le \|A_m - A_k\| \, \|x\| \le \varepsilon \|x\|, \]

for all $m, k > N$. When k tends to $\infty$ we thus obtain

\[ \|(A_m - A_0) x\| \le \varepsilon \|x\| \]

for all $x \in V$ and thus

\[ \|A_m - A_0\| \le \varepsilon \]

for all $m > N$. In other words, $\|A_m - A_0\| \to 0$, as required. □

Exercises 

Exercise 5.1 Consider the Euclidean spaces $\mathbb{R}^n$ with the standard inner product, and
its associated norm. Prove that any linear operator $A : \mathbb{R}^n \to \mathbb{R}^m$ is bounded.

Exercise 5.2 Prove Proposition 5.2.

Exercise 5.3 Prove Theorem 5.6.

Exercise 5.4 Let U be a subspace of a normed space V and let $A : U \to W$ be a
linear operator to a normed space W. Suppose that $B : V \to W$ is a linear operator
which extends A. Show that $\|A\| \le \|B\|$.

Exercise 5.5 Show that point-wise and strong convergence in $B(\mathbb{R}^n, \mathbb{R}^m)$ coincide.

Exercise 5.6 Prove that every infinite dimensional normed space V either over the
field $K = \mathbb{R}$ or the field $K = \mathbb{C}$ admits a discontinuous linear operator $A : V \to K$.
(Hint: Use a Hamel basis.)

Exercise 5.7 Show that Lemma 5.1 may fail if the domain of the linear operator is
not a Banach space.

Exercise 5.8 Let $\{A_m\}_{m \ge 1}$ be a sequence of operators $A_m : V \to W$ between
normed spaces. Prove that if $\{A_m\}_{m \ge 1}$ is Cauchy, then $\{\|A_m\|\}_{m \ge 1}$ is Cauchy. Does
the converse implication hold as well?

Exercise 5.9 Let $A : V \to W$ be a linear operator between normed spaces. Prove
that if A is open, then A is surjective.

Exercise 5.10 Give an example of a normed space V and a proper dense linear 
subspace of it. 


5.2 Fixed-Point Techniques in Banach Spaces 

Since a Banach space $\mathcal{B}$ is in particular a complete and non-empty metric space, the
Banach Fixed-Point Theorem (Theorem 4.8) guarantees that any contraction on $\mathcal{B}$
admits a unique fixed-point. In more detail, a contraction on a Banach space $\mathcal{B}$ is a
function (not necessarily linear) $F : \mathcal{B} \to \mathcal{B}$ such that there exists $0 < \alpha < 1$ with

\[ \|Fx - Fy\| \le \alpha \|x - y\| \]

for all $x, y \in \mathcal{B}$. The guaranteed unique fixed-point for F is an element $x_0 \in \mathcal{B}$ with $F(x_0) = x_0$.
From the proof of the Banach Fixed-Point Theorem, the fixed-point $x_0$ is obtained
as the limit $x_k \to x_0$ where $x_{k+1} = F(x_k)$, and $x_1 \in \mathcal{B}$ is an arbitrary element.

It follows that any problem in a Banach space $\mathcal{B}$ whose solution can be expressed
as a fixed-point for a contraction on $\mathcal{B}$ can be solved iteratively. We explore this
technique in three different scenarios, namely for solving systems of linear equations,
for solving first order differential equations, and for solving integral equations.

Remark 5.4 Notice that the more general problem

\[ x_0 = F(x_0) + b \]

is subsumed by the technique described above since for $F_b(x) = F(x) + b$ one has
that

\[ \|F_b(x) - F_b(y)\| = \|F(x) - F(y)\| \]

and thus F is a contraction if, and only if, $F_b$ is a contraction. Moreover, a fixed-point
for $F_b$ is precisely a solution of the equation $x_0 = F(x_0) + b$. It is customary to
simplify the notation and write $x = Fx + b$. Notice that if F is a linear operator,
then F is a contraction if, and only if, $\|F\| < 1$.
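The iterative scheme can be sketched in a few lines for the simplest Banach space, the real line. The function name `fixed_point`, the stopping tolerance and the sample contraction $F(x) = \cos(x)/2$ are assumptions chosen for illustration; they do not come from the text.

```python
import math


def fixed_point(F, x1, tol=1e-12, max_iter=1000):
    """Iterate x_{k+1} = F(x_k) until successive iterates differ by less than tol."""
    x = x1
    for _ in range(max_iter):
        x_next = F(x)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("no convergence within max_iter iterations")


# |F'(x)| = |sin(x)| / 2 <= 1/2, so F is a contraction on R with alpha = 1/2
F = lambda x: 0.5 * math.cos(x)

root = fixed_point(F, x1=10.0)
print(root, F(root))
```

Starting the iteration from a different initial value, say `fixed_point(F, x1=-3.0)`, leads to the same point, reflecting the uniqueness of the fixed-point.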


5.2.1 Systems of Linear Equations 

Consider a linear system of n equations in n unknowns:

\[ \sum_{i=1}^{n} a_{k,i} \, x_i = b_k, \qquad k = 1, \dots, n. \]

In terms of operators this system of equations can be written as $Ax = b$, where A is
the corresponding $n \times n$ matrix of coefficients viewed as a linear operator $\mathbb{R}^n \to \mathbb{R}^n$,
with $x = (x_1, \dots, x_n)$ representing the unknown vector, and $b = (b_1, b_2, \dots, b_n)$
the vector of free coefficients. Theoretically precise methods for solving such
systems of linear equations are computationally demanding. For instance, Cramer's
rule requires the computation of $n + 1$ determinants of size $n \times n$, and thus, for large n, this becomes
infeasible. Moreover, numerical difficulties resulting from rounding may render
some computation techniques unreliable. We now investigate the applicability of an
iterative fixed-point technique in this case. Such a solution is often more robust than
naive applications of theoretical exact solutions (though we will not delve into a
numerical analysis of the solution here).

Since $Ax = b$ if, and only if, $x = (-A + I_n)x + b$, where $I_n$ is the $n \times n$ identity
matrix, we are led to consider the equation

\[ x = Fx + b \]

where $F = -A + I_n$. The original problem is equivalent to the fixed-point problem

\[ x = F_b x \]

where $F_b x = Fx + b = (-A + I_n)x + b$ is viewed as an operator $F_b : \mathbb{R}^n \to \mathbb{R}^n$.
Let us assume some norm on $\mathbb{R}^n$ is chosen. For the Banach Fixed-Point Theorem to
be applicable, $F_b$ must be a contraction, and since F is a linear operator we must
examine the norm $\|F\|$. In particular, if $\|F\| < 1$, then F, and thus $F_b$, is a contraction
and the original problem is amenable to the iterative solution $x_{k+1} = F_b x_k$ with an
arbitrary initial value $x_1 \in \mathbb{R}^n$.

Whether F is a contraction is typically strongly dependent on the chosen norm
for the ambient space $\mathbb{R}^n$. We consider two cases below, recalling that the Kronecker
$\delta$ function is given by

\[ \delta_{k,i} = \begin{cases} 1 & \text{if } k = i, \\ 0 & \text{if } k \neq i. \end{cases} \]

Example 5.6 Consider $\mathbb{R}^n$ with the norm: $\|x\| = \max_{1 \le k \le n} |x_k|$. Then, for any
$n \times n$ matrix C

\[ \|Cx - Cy\| = \max_{1 \le k \le n} \Big| \sum_{i=1}^{n} c_{k,i} (x_i - y_i) \Big|
   \le \max_{1 \le k \le n} \sum_{i=1}^{n} |c_{k,i}| \, |x_i - y_i|
   \le \max_{1 \le k \le n} \Big[ \sum_{i=1}^{n} |c_{k,i}| \Big] \max_{1 \le i \le n} |x_i - y_i|
   = \max_{1 \le k \le n} \Big[ \sum_{i=1}^{n} |c_{k,i}| \Big] \|x - y\|. \]

Thus, if the inequality

\[ \max_{1 \le k \le n} \sum_{i=1}^{n} |c_{k,i}| < 1 \]

holds, then the operator C is a contraction. Returning to the system $Ax = b$ and
the related operator $F_b x = (-A + I_n)x + b$, we see that $F_b$ is guaranteed to be a
contraction if

\[ \max_{1 \le k \le n} \sum_{i=1}^{n} |-a_{k,i} + \delta_{k,i}| < 1, \]

in which case the original system can be solved iteratively.
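The criterion and the resulting iteration can be tried on a small system. The sketch below is illustrative only: the $3 \times 3$ matrix, the right-hand side, the fixed number of steps and the function names are assumptions, chosen so that the row-sum condition holds with bound $1/2$.

```python
def row_sum_norm(M):
    """max_k sum_i |m_{k,i}|: the bound obtained in the example for the max norm."""
    return max(sum(abs(v) for v in row) for row in M)


def solve_iteratively(A, b, steps=200):
    """Iterate x_{k+1} = (I - A) x_k + b, starting from x_1 = 0."""
    n = len(A)
    F = [[(1.0 if k == i else 0.0) - A[k][i] for i in range(n)] for k in range(n)]
    if row_sum_norm(F) >= 1:
        raise ValueError("row-sum criterion fails; convergence is not guaranteed")
    x = [0.0] * n
    for _ in range(steps):
        x = [sum(F[k][i] * x[i] for i in range(n)) + b[k] for k in range(n)]
    return x


A = [[1.0, 0.2, 0.1],
     [0.1, 0.9, 0.3],
     [-0.2, 0.1, 1.1]]
b = [1.0, 2.0, 3.0]

x = solve_iteratively(A, b)
# residual of the original system Ax = b, measured in the max norm
residual = max(abs(sum(A[k][i] * x[i] for i in range(3)) - b[k]) for k in range(3))
print(x, residual)
```

Since the error shrinks by at least the factor $\|F\|$ at every step, a bound of $1/2$ makes a few dozen steps sufficient for double precision.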

Example 5.7 Let us now consider $\mathbb{R}^n$ with the $\ell_2$ norm $\|x\|^2 = \sum_{k=1}^{n} x_k^2$, giving
rise to the Euclidean metric. For an $n \times n$ matrix C viewed as an operator on $\mathbb{R}^n$, we
have

\[ \|Cx\|^2 = \sum_{k=1}^{n} (c_k \cdot x)^2 \]

where $c_k$ is the k-th row of the matrix C, and $c_k \cdot x$ is the standard inner product in
$\mathbb{R}^n$. Applying the Cauchy-Schwarz Inequality we obtain

\[ \|Cx\|^2 \le \sum_{k=1}^{n} \|c_k\|^2 \, \|x\|^2 \]

and thus we see that

\[ \|C\| \le \sqrt{\sum_{k=1}^{n} \|c_k\|^2} = \sqrt{\sum_{k,i=1}^{n} c_{k,i}^2}. \]

Going back to the linear system $Ax = b$, we conclude that it is iteratively solvable if

\[ \sum_{k,i=1}^{n} (-a_{k,i} + \delta_{k,i})^2 < 1. \]
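That the two criteria really differ can be checked on a concrete system. In the following minimal sketch the $2 \times 2$ matrix is an assumption chosen so that the row-sum condition of the max norm fails while the sum-of-squares condition holds; the iteration still converges, as the $\ell_2$ estimate predicts.

```python
def frobenius_sq(M):
    """Sum of the squares of all entries: the quantity bounded by 1 in the l_2 criterion."""
    return sum(v * v for row in M for v in row)


def row_sum(M):
    """max_k sum_i |m_{k,i}|: the quantity bounded by 1 in the max-norm criterion."""
    return max(sum(abs(v) for v in row) for row in M)


A = [[0.4, -0.6],
     [0.0, 1.0]]
b = [1.0, 2.0]
F = [[(1.0 if k == i else 0.0) - A[k][i] for i in range(2)] for k in range(2)]

x = [0.0, 0.0]
for _ in range(300):
    # one step of x_{k+1} = F x_k + b
    x = [sum(F[k][i] * x[i] for i in range(2)) + b[k] for k in range(2)]

print(row_sum(F), frobenius_sq(F), x)
```

Solving the system by hand gives $x = (5.5, 2)$, which the iterates approach.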


5.2.2 Cauchy’s Problem and the Volterra Equation 

Consider Cauchy's problem for the first order differential equation:

\[ \frac{dx}{dt}(t) = f[t, x(t)] \quad \text{with} \quad x(t_0) = x_0. \]

The associated integral equation

\[ x(t) = \int_{t_0}^{t} ds \, f[s, x(s)] + x_0 \]

is called a Volterra integral equation. Suppose that, for $|t - t_0| \le \varepsilon$ and $|x - x_0| \le \eta$,
the function $f(t, x)$ is continuous and $|f(t, x)| \le M$. Noting that a solution of the
Volterra equation is also a solution to the original Cauchy problem, we introduce the
operator

\[ (Bx)(t) = \int_{t_0}^{t} ds \, f[s, x(s)] + x_0 \]

since its fixed-points are precisely solutions to the Volterra equation. If B can be seen
to be an operator

\[ B : C(I, \mathbb{R}) \to C(I, \mathbb{R}) \]

for a suitable interval $I \subseteq [a, b]$, and if B is moreover a contraction with respect
to some norm, then locally Cauchy's problem has a unique solution, which, starting
with an arbitrary $x_0$, may be computed iteratively:

\[ x_1(t) = \int_{t_0}^{t} ds \, f(s, x_0) + x_0, \quad x_2(t) = \int_{t_0}^{t} ds \, f[s, x_1(s)] + x_0, \quad \dots, \quad
   x_n(t) = \int_{t_0}^{t} ds \, f[s, x_{n-1}(s)] + x_0. \]

Since the norm we choose must turn the space $C(I, \mathbb{R})$ into a Banach space (since
otherwise Banach's Fixed-Point Theorem does not apply) the only norm of those
encountered so far that we can consider is the $L_\infty$ norm (and now the importance
of the $L_p$ spaces becomes apparent as they are complete and thus they allow one to
consider other norms).

Let us examine under which assumptions it is guaranteed that B is a contraction
for a suitable interval. Consider $C(I, \mathbb{R})$ where $I = [t_0 - \varepsilon, t_0 + \varepsilon]$ and choose some
interval $J = [x_0 - \eta, x_0 + \eta] \subseteq \mathbb{R}$. We seek to identify conditions on $\varepsilon$,
$\eta$, and M that will assure the existence of a solution, at least locally. We note the
following.

1. The function Bx is certainly a continuous function, given the assumptions on f.

2. \[ \|Bx - x_0\| = \max_{t \in I} \Big| \int_{t_0}^{t} ds \, f[s, x(s)] \Big|
   \le \max_{t \in I} |t - t_0| \max_{t \in I, \, x \in J} |f(t, x)| \le \varepsilon M. \]

We thus see that the operator B transforms a function x taking values in J into a function
taking values in J provided that $\varepsilon \le \eta / M$, a condition we now assume is met.

3. For all functions x, y taking values in J we have that

\[ \|Bx - By\| \le \max_{t \in I} \Big| \int_{t_0}^{t} ds \, \big| f[s, x(s)] - f[s, y(s)] \big| \Big|. \]

4. Suppose the function $f(t, x)$ is not just continuous and bounded, but also satisfies
the Lipschitz condition:

\[ |f(t, x) - f(t, y)| \le K |x - y|, \quad \forall \, x, y \in J, \]

uniformly for all $t \in I$. Then

\[ \|Bx - By\| \le \max_{t \in I} \Big| \int_{t_0}^{t} ds \, \big| f[s, x(s)] - f[s, y(s)] \big| \Big|
   \le K \max_{t \in I} \Big| \int_{t_0}^{t} ds \, |x(s) - y(s)| \Big|
   \le K \varepsilon \max_{s \in I} |x(s) - y(s)| = K \varepsilon \|x - y\|. \]

Collecting these observations together we conclude that if $\varepsilon \le \eta / M$ and $\varepsilon < 1/K$,
then the operator B is a contraction and Cauchy's problem has one, and only one,
solution inside the reduced interval $I = [t_0 - \varepsilon, t_0 + \varepsilon]$, with $\varepsilon < \min(\eta / M, 1/K)$.
Furthermore, the solution can be obtained iteratively as a fixed-point solution of the
associated Volterra equation.
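The successive approximations for Cauchy's problem can be carried out exactly in a simple case. The sketch below takes the illustrative problem $dx/dt = x$, $x(0) = 1$ (so $f(t, x) = x$, with Lipschitz constant $K = 1$), which is an assumption made here and not an example from the text. Each iterate is a polynomial, stored as a list of coefficients, and the function names are hypothetical.

```python
import math


def picard_step(coeffs):
    """One step x -> Bx for x' = x, x(0) = 1:
    (Bx)(t) = 1 + integral from 0 to t of x(s) ds,
    where x is the polynomial c_0 + c_1 t + c_2 t^2 + ..."""
    integrated = [c / (k + 1) for k, c in enumerate(coeffs)]  # c_k t^k -> c_k t^(k+1)/(k+1)
    return [1.0] + integrated  # constant term x_0 = 1, followed by the integral


def evaluate(coeffs, t):
    return sum(c * t ** k for k, c in enumerate(coeffs))


x = [1.0]  # x_0(t) = 1, the constant initial function
for _ in range(15):
    x = picard_step(x)

# the exact solution is exp(t); t = 0.5 lies inside the range eps < 1/K = 1
print(evaluate(x, 0.5), math.exp(0.5))
```

In this case the n-th iterate is the Taylor polynomial of degree n of the exponential function, so the uniform convergence on a small interval is visible directly.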


5.2.3 Fredholm Equations 

The Volterra equations, considered above, are examples of integral equations with
variable limits. We now turn to consider Fredholm equations, which belong to the
family of integral equations with fixed limits.

First consider the general form of a non-linear Fredholm equation

\[ x(t) = \lambda \int_a^b ds \, K[t, s, x(s)] = [Ax](t), \]

where we consider A as an operator, and we seek to identify conditions under which
the iterative fixed-point method is applicable. Let $K(t, s, x)$ be a continuous function
in $[a, b] \times [a, b] \times [-\eta, \eta]$ and bounded by $M > 0$. Let $J = [-\eta, \eta]$ and consider
$C([a, b], \mathbb{R})$ with the $L_\infty$ norm. Noticing that

\[ \|Ax\| = \max_{t \in [a, b]} \Big| \lambda \int_a^b ds \, K[t, s, x(s)] \Big| \le |\lambda| M (b - a) \]

we see that if $|\lambda| \le \eta / [M(b - a)]$, then $\|Ax\| \le \eta$ and so the operator A is
well-defined when restricting to functions taking values in the interval J. Furthermore,

\[ \|Ax - Ay\| = \max_{t \in [a, b]} \Big| \lambda \int_a^b ds \, \big\{ K[t, s, x(s)] - K[t, s, y(s)] \big\} \Big|
   \le |\lambda| \max_{t \in [a, b]} \int_a^b ds \, \big| K[t, s, x(s)] - K[t, s, y(s)] \big|. \]

If we also make the assumption that the function $K(t, s, x)$ is Lipschitz with respect
to x, uniformly in the square $[a, b] \times [a, b]$, with Lipschitz constant K (a positive number,
written without arguments to distinguish it from the kernel), then:

\[ \|Ax - Ay\| \le |\lambda| (b - a) K \max_{s \in [a, b]} |x(s) - y(s)| = |\lambda| (b - a) K \|x - y\|. \]

It thus follows that if $|\lambda| (b - a) K < 1$, then the operator A is a contraction, and
thus the original non-linear Fredholm equation has a unique solution, obtained by
the iterative scheme

\[ x_{n+1}(t) = \lambda \int_a^b ds \, K[t, s, x_n(s)], \]

with $x_0(t)$ an arbitrary continuous function on $[a, b]$, satisfying $\max_t |x_0(t)| \le \eta$.
The condition

\[ |\lambda| (b - a) < \min \{ \eta / M, \ 1/K \} \]

guarantees the applicability of the iterative solution.

Consider now the linear Fredholm equation:

\[ x(t) = \lambda \int_a^b ds \, K(t, s) \, x(s) + f(t), \]

with $\lambda$ a real parameter, $f(t)$ a continuous function in $[a, b]$ and $K(t, s)$ continuous
in the square $[a, b] \times [a, b]$ and bounded by $M > 0$. Still considering $C([a, b], \mathbb{R})$
with the $L_\infty$ norm, consider the operator

\[ (Bx)(t) = \lambda \int_a^b ds \, K(t, s) \, x(s) + f(t). \]

We now have that

\[ \|Bx - By\| = \max_{t \in [a, b]} \Big| \lambda \int_a^b ds \, K(t, s) [x(s) - y(s)] \Big|
   \le |\lambda| M (b - a) \max_{s \in [a, b]} |x(s) - y(s)| = |\lambda| M (b - a) \|x - y\|, \]

and thus, for $|\lambda| < 1 / [M(b - a)]$, the operator B is a contraction and the uniform
limit (i.e., with respect to the norm) of the iterative process

\[ x_{n+1}(t) = \lambda \int_a^b ds \, K(t, s) \, x_n(s) + f(t) = \dots \]

exists and gives the solution to the given Fredholm equation. A possible choice of
the initial function is given by $x_0(t) = f(t)$.
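The iterative process can be imitated numerically by replacing the integral with a quadrature sum. The following is a rough sketch under stated assumptions: the midpoint rule, the number of nodes, the number of steps and the name `fredholm_iterate` are choices made here for illustration, and the test case uses the kernel $K(t, s) = ts$ with $f(t) = t$ on $[0, 1]$, for which Example 5.8 (with $\alpha = 1$) gives the solution $3t/(3 - \lambda)$.

```python
def fredholm_iterate(kernel, f, lam, a, b, nodes=100, steps=40):
    """Iterate x_{n+1}(t) = lam * int_a^b K(t, s) x_n(s) ds + f(t) on a grid,
    approximating the integral by the midpoint rule; starts from x_0 = f."""
    h = (b - a) / nodes
    grid = [a + (j + 0.5) * h for j in range(nodes)]
    fx = [f(t) for t in grid]
    x = fx[:]
    for _ in range(steps):
        x = [lam * h * sum(kernel(t, s) * xs for s, xs in zip(grid, x)) + ft
             for t, ft in zip(grid, fx)]
    return grid, x


lam = 0.5
grid, x = fredholm_iterate(lambda t, s: t * s, lambda t: t, lam, a=0.0, b=1.0)

# compare with the closed form 3 t / (3 - lam) at the grid points
error = max(abs(xi - 3 * t / (3 - lam)) for t, xi in zip(grid, x))
print(error)
```

The remaining discrepancy comes from the quadrature rule rather than from the fixed-point iteration, so it shrinks as the number of nodes grows.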

Example 5.8 Consider in $C([0, 1], \mathbb{R})$ the Fredholm equation:

\[ x(t) = \lambda \int_0^1 ds \, t s \, x(s) + \alpha t, \]

with $\alpha$ a real parameter. Since $\max_{t, s \in [0, 1]} K(t, s) = 1$ and $(b - a) = 1$, it follows
that for $|\lambda| < 1 / [M(b - a)] = 1$ the operator is a contraction and one can solve the
problem by successive approximations:

\[ x_0 = \alpha t = \alpha_0 t, \quad x_1 = \lambda \int_0^1 ds \, (t s)(\alpha_0 s) + \alpha t = (\alpha + \alpha_0 \lambda / 3) \, t = \alpha_1 t, \quad \dots \]

\[ x_{n+1} = \lambda \int_0^1 ds \, (t s)(\alpha_n s) + \alpha t = (\alpha + \alpha_n \lambda / 3) \, t = \alpha_{n+1} t, \quad \dots \]

For $|\lambda| < 1$ the limit $x = \lim_{n \to \infty} x_n$ exists and thus also $\beta = \lim_{n \to \infty} \alpha_n$ exists.
When passing to the limit in the last equality we obtain that $\alpha + \beta \lambda / 3 = \beta$, from
which we conclude that $\beta = 3\alpha / (3 - \lambda)$ and, finally, that $x(t) = 3\alpha t / (3 - \lambda)$.

We got this solution by assuming $|\lambda| < 1$, but one can easily see that the limit of $\alpha_n$
as $n \to \infty$ also exists if $|\lambda| < 3$, as suggested by the last relation. In fact:

\[ \alpha_{n+1} = \alpha + \alpha_n \lambda / 3 = \dots = \alpha \big[ 1 + \lambda / 3 + (\lambda / 3)^2 + \dots + (\lambda / 3)^{n+1} \big]
   \to 3\alpha / (3 - \lambda) \quad \text{for } |\lambda| < 3. \]

The condition $|\lambda| < 1$ is a sufficient condition for the convergence of the iterative
process which, as just seen, has a broader domain of convergence. More generally,
the function $x(t)$ found above is actually a solution of the original problem for every
$\lambda \neq 3$, as is easily checked with a test function $x(t) = \gamma t$, with $\gamma$ unknown.
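The recursion for the coefficients is easy to run directly. The sketch below iterates $\alpha_{n+1} = \alpha + \alpha_n \lambda / 3$ from $\alpha_0 = \alpha$ and compares the result with $3\alpha / (3 - \lambda)$; the function name, the number of steps and the sample values of $\alpha$ and $\lambda$ are assumptions made for illustration.

```python
def alpha_limit(alpha, lam, steps=400):
    """Iterate a_{n+1} = alpha + a_n * lam / 3, starting from a_0 = alpha."""
    a = alpha
    for _ in range(steps):
        a = alpha + a * lam / 3.0
    return a


alpha = 2.0
for lam in (0.5, 2.5, -2.0):
    # iterated value next to the closed form 3 * alpha / (3 - lam)
    print(lam, alpha_limit(alpha, lam), 3 * alpha / (3 - lam))
```

The values $\lambda = 2.5$ and $\lambda = -2$ lie outside the range $|\lambda| < 1$ guaranteed by the contraction estimate but inside $|\lambda| < 3$, and the iteration still settles on the closed form.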


Exercises 


Exercise 5.11 Show that a contraction on an arbitrary normed space need not have 
a fixed-point. 

Exercise 5.12 Give an example of an isometry $F : \mathbb{R} \to \mathbb{R}$ without any fixed-points
(here $\mathbb{R}$ is given the Euclidean metric structure).

Exercise 5.13 A function $f : X \to Y$ between metric spaces is a Lipschitz function
if there exists a number $K \in \mathbb{R}$, called a Lipschitz constant, such that

\[ d(f(x), f(x')) \le K \, d(x, x') \]

for all $x, x' \in X$. Show that, with respect to the Euclidean metric on $\mathbb{R}$, a function
$f : \mathbb{R} \to \mathbb{R}$ with a bounded derivative is Lipschitz.

Exercise 5.14 Prove that a Lipschitz function $f : X \to Y$ between metric spaces
is uniformly continuous, but the converse may fail.

Exercise 5.15 Give an example of a function $F : X \to Y$ between metric spaces
which satisfies $d(Fx, Fy) < d(x, y)$ for all distinct $x, y \in X$, but is not a contraction.

Exercise 5.16 Perform a similar analysis as done in Example 5.7 but consider the
$\ell_1$ norm on $\mathbb{R}^n$ and attempt to obtain a reasonable criterion for the applicability of
the fixed-point technique for solving a system of linear equations.

Exercise 5.17 Referring to Example 5.4, prove, once more, that the differentiation
operator $d/dt$ on $C([a, b], \mathbb{R})$ is unbounded by considering the sequence

\[ x_n(t) = \sin [ n \pi (b - t) / (b - a) ]. \]


Exercise 5.18 With successive substitutions, solve the Volterra equation: 




Exercise 5.19 Following Example 5.8, solve the equation:

\[ x(t) = \frac{5}{6} t + \frac{1}{2} \int_0^1 ds \, t s \, x(s). \]


Exercise 5.20 Following Example 5.8, solve the equation: 

\[ x(t) = e^t - 1 + \int_0^1 ds \, t \, x(s). \]


5.3 Inverse Operators 

From Chap. 2 we know that if a linear operator $A : V \to W$ admits an inverse
function $A^{-1} : W \to V$, then that function is automatically a linear operator.
The Open Mapping Theorem implies that if the linear spaces are Banach spaces
and the operator is bounded, then its inverse is also bounded. In this section we will
establish a condition on A that guarantees the existence of a bounded inverse without
assuming the spaces are Banach. We will then obtain a simple yet powerful result
which presents the inverse of an operator in terms of a convergent series of operators.
The analysis will reveal the close connection between invertibility and fixed-point 
solutions. We will then have a second, deeper, look at the Volterra and Fredholm 
equations in light of the results below. 


5.3.1 Existence of Bounded Inverses 

We present conditions that guarantee the existence of inverse operators, providing 
estimates for their norm, and, under suitable conditions, we obtain the inverse as the 
sum of a strongly convergent series. 

Theorem 5.13 Let $A : V \to W$ be a bounded surjective linear operator between
normed spaces. Then $A$ is invertible with bounded inverse $A^{-1}$ if and only if there
exists a constant $C > 0$ such that $\|Ax\| \ge C\|x\|$ for all $x \in V$. Moreover, with such
a constant $C$ the operator norm of the inverse satisfies $\|A^{-1}\| \le 1/C$.

Proof Suppose that $\|Ax\| \ge C\|x\|$ holds. If $Ax = 0$, then $0 = \|Ax\| \ge C\|x\|$,
and thus $x = 0$. In other words, $\operatorname{Ker}(A) = \{0\}$, and thus $A$ is injective. Since it is
given that $A$ is surjective, we conclude that $A$ is invertible and that its inverse $A^{-1}$
is automatically a linear operator. Moreover, for all $y \in W$ let $x \in V$ with $Ax = y$,
and thus $x = A^{-1}y$. We then have

$$\|y\| = \|Ax\| \ge C\|x\| = C\|A^{-1}y\|,$$

showing that $\|A^{-1}\| \le 1/C$. In the other direction, suppose that $A^{-1}$ exists and is
bounded. Then, for all $x \in V$,

$$\|x\| = \|A^{-1}Ax\| \le \|A^{-1}\|\,\|Ax\|.$$

Noticing that the norm of an invertible operator is strictly positive, we conclude that

$$\|Ax\| \ge \frac{1}{\|A^{-1}\|}\,\|x\|$$

and thus we may take $C = 1/\|A^{-1}\|$. □
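As a finite-dimensional sanity check of Theorem 5.13, the sketch below takes a $2 \times 2$ matrix acting on $\mathbb{R}^2$ with the max norm, computes $C = 1/\|A^{-1}\|$, and tests the lower bound $\|Ax\| \ge C\|x\|$ on random vectors. The matrix and all helper names are illustrative assumptions, not part of the text.

```python
import random

def max_norm(x):
    """Max norm of a vector."""
    return max(abs(c) for c in x)

def apply(A, x):
    """Matrix-vector product."""
    return [sum(a * c for a, c in zip(row, x)) for row in A]

def op_norm_inf(A):
    """Operator norm induced by the max norm: the largest absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)

def inverse_2x2(A):
    """Explicit inverse of a 2x2 matrix."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

# Hypothetical example operator (an assumption made for illustration).
A = [[2.0, 1.0], [0.0, 3.0]]
C = 1.0 / op_norm_inf(inverse_2x2(A))

def lower_bound_holds(A, C, trials=1000, seed=0):
    """Check ||Ax|| >= C ||x|| on randomly sampled vectors."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        if max_norm(apply(A, x)) < C * max_norm(x) - 1e-12:
            return False
    return True
```

Random sampling can only fail to find a counterexample; the guarantee itself comes from the theorem, since $\|x\| = \|A^{-1}Ax\| \le \|A^{-1}\|\,\|Ax\|$.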

Consider now an inhomogeneous fixed-point problem $x = Ax + b$, where $A : \mathcal{B} \to \mathcal{B}$
is a linear operator on a Banach space $\mathcal{B}$. As we know, if $A$ is a contraction,
namely if $\|A\| < 1$, then a unique solution exists. Notice though that the problem
may be re-written as $(I - A)x = b$, where we write $I$ for the identity operator on $\mathcal{B}$.
Now, if $(I - A)$ is invertible, then we obtain the unique solution $x = (I - A)^{-1}b$.
Since the latter does not rely on the norm it actually holds in any linear space. We
thus have two criteria for the solution of the equation $x = Ax + b$: one is that $A$ be
a contraction and the other is that $I - A$ be invertible. The next result shows that the
former implies the latter, when the ambient space is a Banach space.

Proposition 5.3 Let $A : \mathcal{B} \to \mathcal{B}$ be a linear operator on a Banach space $\mathcal{B}$. If
$\|A\| < 1$, then $I - A$ is invertible. Moreover, $\|(I - A)^{-1}\| \le 1/(1 - \|A\|)$.

Proof Since $\|Ax\| \le \|A\|\,\|x\|$, it follows that

$$\|(I - A)x\| = \|x - Ax\| \ge \|x\| - \|Ax\| \ge \|x\| - \|A\|\,\|x\| = (1 - \|A\|)\,\|x\|,$$

and the claim would follow from Theorem 5.13 if we can show that $I - A$ is surjective.
In other words, given $y \in \mathcal{B}$, we wish to solve the equation

$$(I - A)x = y,$$

or, equivalently, the equation

$$x = Ax + y.$$

But this is precisely an inhomogeneous fixed-point problem, and since $A$ is assumed
to be a contraction, a solution exists. □
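The argument above can be illustrated numerically. The following sketch, under the assumption of a small matrix $A$ on $\mathbb{R}^2$ with the max norm and $\|A\| < 1$, solves $x = Ax + b$ by successive substitution and compares the result with the bound $\|x\| \le \|b\|/(1 - \|A\|)$ from Proposition 5.3. The function names and the data are hypothetical.

```python
def max_norm(x):
    return max(abs(c) for c in x)

def op_norm_inf(A):
    # Operator norm induced by the max norm: largest absolute row sum.
    return max(sum(abs(a) for a in row) for row in A)

def solve_fixed_point(A, b, tol=1e-12, max_iter=10_000):
    """Iterate x <- A x + b, starting from 0, until successive iterates agree."""
    if op_norm_inf(A) >= 1:
        raise ValueError("A is not a contraction in the max norm")
    x = [0.0] * len(b)
    for _ in range(max_iter):
        x_new = [sum(a * c for a, c in zip(row, x)) + bi
                 for row, bi in zip(A, b)]
        if max_norm([u - v for u, v in zip(x_new, x)]) < tol:
            return x_new
        x = x_new
    raise RuntimeError("iteration did not converge")

# Illustrative data (assumed): ||A|| = 0.5 in the max norm.
A = [[0.2, 0.3], [0.1, 0.4]]
b = [1.0, 2.0]
x = solve_fixed_point(A, b)
```

For this data the exact solution of $(I - A)x = b$ is $x = (1.2/0.45,\ 1.7/0.45)$, which the iteration reproduces up to the chosen tolerance.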

We thus see that the invertibility of $I - A$ is important for the solution of fixed-point
problems, and thus, as we saw above, to the solution of systems of linear equations,
differential equations, and integral equations (and other problems as well which we
did not mention). While the condition $\|A\| < 1$ gives a sufficient condition for the
existence of $(I - A)^{-1}$, it does not provide us with any feasible means of obtaining
the inverse. The following result remedies the situation.

Definition 5.9 Let $A : V \to V$ be a linear operator on a normed space $V$. The series
$\sum_{k=0}^{\infty} A^k$, where $A^0$ is interpreted as the identity operator, is called the Neumann
series of $A$.

Recall the notation $\mathbf{B}(V, W)$ for the set of all bounded linear operators $A : V \to W$.
When $V = W$ we will write $\mathbf{B}(V)$ instead of $\mathbf{B}(V, V)$. Similarly, for a
Banach space $\mathcal{B}$, we write $\mathbf{B}(\mathcal{B})$ instead of $\mathbf{B}(\mathcal{B}, \mathcal{B})$.

Theorem 5.14 Let $A : V \to V$ be a linear operator on a normed space $V$. If the
Neumann series for $A$ converges in the operator norm, then the Neumann series is
the inverse of $I - A$.

Proof Suppose that the Neumann series converges to the operator $S$, that is

$$S = \lim_{m \to \infty} \sum_{k=0}^{m} A^k$$

with respect to the operator norm in the space $\mathbf{B}(V)$. Now, since

$$(I - A)\sum_{k=0}^{m} A^k = I - A^{m+1},$$

when $m$ tends to $\infty$ we obtain

$$(I - A)\sum_{k=0}^{\infty} A^k = I - \lim_{m \to \infty} A^{m+1}.$$

We leave it as an exercise to the reader to verify that if the Neumann series of $A$
converges, then $A^m \to 0$ in the operator norm, and thus we see that the Neumann
series is a right inverse of $I - A$. A similar argument shows it is also a left inverse,
and the proof is complete. □
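The telescoping identity $(I - A)\sum_{k=0}^{m} A^k = I - A^{m+1}$ at the heart of the proof can be checked exactly in finite dimensions. The sketch below does so with rational arithmetic for an arbitrarily chosen $2 \times 2$ matrix; the matrix and the helper functions are assumptions made for illustration only.

```python
from fractions import Fraction

def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_pow(A, m):
    P = identity(len(A))
    for _ in range(m):
        P = mat_mul(P, A)
    return P

def neumann_partial_sum(A, m):
    """Return I + A + ... + A^m."""
    total = identity(len(A))
    power = identity(len(A))
    for _ in range(m):
        power = mat_mul(power, A)
        total = mat_add(total, power)
    return total

# Hypothetical example matrix with exact rational entries.
A = [[Fraction(1, 2), Fraction(1, 3)],
     [Fraction(0), Fraction(1, 4)]]
m = 6
I = identity(2)
S_m = neumann_partial_sum(A, m)
right = mat_mul(mat_sub(I, A), S_m)     # (I - A) S_m
left = mat_mul(S_m, mat_sub(I, A))      # S_m (I - A)
target = mat_sub(I, mat_pow(A, m + 1))  # I - A^(m+1)
```

Both products agree with `target` exactly, which mirrors the right-inverse and left-inverse halves of the proof before passing to the limit.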

The expected sufficient condition on an operator A on a Banach space guaranteeing 
the convergence of the Neumann series is given in the next result. 

Corollary 5.3 Let $A : \mathcal{B} \to \mathcal{B}$ be a linear operator on a Banach space $\mathcal{B}$ with
$\|A\| < 1$. Then the Neumann series of $A$ converges, and in particular

$$(I - A)^{-1} = \sum_{k=0}^{\infty} A^k,$$

where the series converges in the operator norm.

Proof All we have to do is establish the convergence in $\mathbf{B}(\mathcal{B})$ of the Neumann series.
Since $\mathcal{B}$ is a Banach space, so is the space $\mathbf{B}(\mathcal{B})$. Using Theorem 5.6, it suffices
to show the convergence of the series $\sum_{k=0}^{\infty} \|A^k\|$. But $\|A^k\| \le \|A\|^k$ and thus the
latter series is majorized by $\sum_{k=0}^{\infty} \|A\|^k$, which converges since $0 \le \|A\| < 1$. □

Corollary 5.4 If $A$ is as above and $\lambda \in \mathbb{R}$ is such that $\|\lambda A\| = |\lambda|\,\|A\| < 1$, then
$I - \lambda A$ is invertible and its inverse is given by the Neumann series

$$(I - \lambda A)^{-1} = \sum_{k=0}^{\infty} \lambda^k A^k.$$
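A small numerical sketch of Corollary 5.4 follows. It assumes a $2 \times 2$ matrix with $\|A\| = 2$ in the max-norm operator norm (so $A$ itself is not a contraction) and $\lambda = 1/4$, sums the series $\sum_k \lambda^k A^k$, and compares the result with the explicit inverse of $I - \lambda A$. The data and the function names are hypothetical.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def op_norm_inf(A):
    # Operator norm induced by the max norm: largest absolute row sum.
    return max(sum(abs(a) for a in row) for row in A)

def neumann_inverse(A, lam, terms=60):
    """Approximate (I - lam*A)^(-1) by the partial sum of lam^k A^k, k = 0..terms."""
    if abs(lam) * op_norm_inf(A) >= 1:
        raise ValueError("need |lambda| * ||A|| < 1")
    n = len(A)
    total = identity(n)
    term = identity(n)
    for _ in range(terms):
        term = [[lam * v for v in row] for row in mat_mul(term, A)]
        total = [[s + t for s, t in zip(rs, rt)] for rs, rt in zip(total, term)]
    return total

# Illustrative data (assumed): ||A|| = 2, so q = |lambda| * ||A|| = 0.5.
A = [[1.0, 1.0], [0.5, 1.5]]
LAM = 0.25
approx = neumann_inverse(A, LAM)

# Explicit 2x2 inverse of I - LAM*A for comparison.
det = (1 - LAM * A[0][0]) * (1 - LAM * A[1][1]) - (LAM * A[0][1]) * (LAM * A[1][0])
exact = [[(1 - LAM * A[1][1]) / det, LAM * A[0][1] / det],
         [LAM * A[1][0] / det, (1 - LAM * A[0][0]) / det]]
```

The truncation error after $m$ terms is at most $q^{m+1}/(1 - q)$ with $q = |\lambda|\,\|A\|$, which is why a modest number of terms suffices here.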


5.3.2 Fixed-Point Techniques Revisited 

The fixed-point techniques given in Sect. 5.2 relied on point-wise convergence. The
convergence of the Neumann series, though, is strong convergence (i.e., in the operator
norm), a form of convergence much stronger than point-wise convergence. We thus
expect to have greater control over the situation when the Neumann series can be
brought into play when solving a fixed-point problem. We now look again
at some of the differential and integral equations we solved above and witness the
enhancement due to the strong convergence.

Example 5.9 Consider the linear Fredholm equation

$$x(t) = \lambda \int_a^b ds\; K(t, s)\,x(s) + f(t) = \lambda Ax + f$$

with $K(t, s)$, known as the nucleus, continuous in the square $[a, b] \times [a, b]$, and $f(t)$
continuous in the interval $[a, b]$. Since

$$|K(t, s)| \le M \implies \|\lambda A\| \le |\lambda| M (b - a) = q,$$

we see that for those values of $\lambda$ such that $q < 1$, the Neumann series

$$\sum_{k=0}^{\infty} \lambda^k A^k$$

of $\lambda A$ converges (in the operator norm). Here $A^0 = \mathrm{id} = I$ and

$$A = \int_a^b ds\; K_1(t, s)(*), \quad A^2 = \int_a^b ds\; K_2(t, s)(*), \quad \ldots, \quad A^n = \int_a^b ds\; K_n(t, s)(*), \quad \ldots,$$

where

$$K_1(t, s) = K(t, s), \quad K_2(t, s) = \int_a^b du\; K(t, u)\,K(u, s), \quad \ldots, \quad K_n(t, s) = \int_a^b du\; K_{n-1}(t, u)\,K(u, s), \quad \ldots$$

Notice further that for all $t, s \in [a, b]$

$$|\lambda^i K_i(t, s)| \le |\lambda|^i M^i (b - a)^{i-1} = \frac{q^i}{b - a},$$

and, since $0 \le q < 1$ by assumption, the series

$$K(t, s; \lambda) = \sum_{i=1}^{\infty} \lambda^i K_i(t, s)$$

converges uniformly in $[a, b] \times [a, b]$. The function $K(t, s; \lambda)$ is called the resolvent
nucleus and is a continuous function in the square $[a, b] \times [a, b]$ (as a uniform limit
of continuous functions). From here it follows that

$$x(t) = f(t) + \int_a^b ds\; K(t, s; \lambda)\,f(s).$$

Example 5.10 Consider the Fredholm integral equation

$$x(t) = \sin t - \frac{t}{4} + \frac{1}{4}\int_0^{\pi/2} ds\; ts\, x(s).$$

With reference to the previous example, we have the estimates

$$|K(t, s)| = |ts| \le \frac{\pi^2}{4} = M, \quad (b - a) = \frac{\pi}{2}, \quad \lambda = \frac{1}{4} < \frac{1}{M(b - a)} = \frac{8}{\pi^3},$$

and thus we can perform the same series expansion as above. The resolvent nucleus
is now given by

$$K(t, s; \lambda) = \lambda K_1(t, s) + \lambda^2 K_2(t, s) + \lambda^3 K_3(t, s) + \cdots,$$

where

$$K_1(t, s) = ts, \quad K_2(t, s) = \int_0^{\pi/2} du\; (tu)(us) = \alpha\, ts,$$

$$K_3(t, s) = \int_0^{\pi/2} du\; K_2(t, u)\,K_1(u, s) = \alpha \int_0^{\pi/2} du\; (tu)(us) = \alpha^2\, ts, \quad \ldots,$$

$$K_n(t, s) = \alpha^{n-1}\, ts, \quad \text{with } \alpha = \int_0^{\pi/2} du\; u^2 = \pi^3/24.$$

The expansion of all nuclei is given by:

$$K(t, s; \lambda) = \lambda\, ts + \lambda(\lambda\alpha)\, ts + \lambda(\lambda\alpha)^2\, ts + \cdots = \lambda\, ts\,(1 - \lambda\alpha)^{-1},$$

since $\lambda\alpha = \pi^3/(4 \cdot 24) < 1$. The solution of our problem is then

$$x(t) = \sin t - \frac{t}{4} + \frac{1}{4}\int_0^{\pi/2} ds\; \frac{ts}{1 - \alpha/4}\left(\sin s - \frac{s}{4}\right) = \sin t - \frac{t}{4} + \frac{t}{4 - \alpha}\int_0^{\pi/2} ds \left(s \sin s - \frac{s^2}{4}\right).$$

Since

$$\int_0^{\pi/2} ds\; s \sin s = \Big[-s\cos s\Big]_0^{\pi/2} + \int_0^{\pi/2} ds\; \cos s = \Big[\sin s\Big]_0^{\pi/2} = 1$$

we finally obtain

$$x(t) = \sin t - \frac{t}{4} + \frac{t}{4 - \alpha}\left(1 - \frac{\alpha}{4}\right) = \sin t - \frac{t}{4} + \frac{t}{4} = \sin t.$$
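The closed form obtained in Example 5.10 can be checked numerically. The sketch below evaluates $x(t) = f(t) + \int_0^{\pi/2} ds\, K(t, s; \lambda) f(s)$ with the resolvent nucleus $K(t, s; \lambda) = \lambda ts/(1 - \lambda\alpha)$, using a composite Simpson rule for the integral, and compares the outcome with $\sin t$. The quadrature routine and all names are assumptions introduced for illustration.

```python
import math

def simpson(g, a, b, n=200):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

UPPER = math.pi / 2          # integration interval is [0, pi/2]
LAM = 0.25                   # lambda = 1/4
ALPHA = math.pi ** 3 / 24    # alpha = integral of u^2 over [0, pi/2]

def f(t):
    """Inhomogeneous term f(t) = sin t - t/4."""
    return math.sin(t) - t / 4

def resolvent(t, s):
    """Resolvent nucleus K(t, s; lambda) = lambda * t * s / (1 - lambda * alpha)."""
    return LAM * t * s / (1 - LAM * ALPHA)

def x(t):
    """Solution built from the resolvent nucleus."""
    return f(t) + simpson(lambda s: resolvent(t, s) * f(s), 0.0, UPPER)
```

Evaluating `x(t)` at a few points of $[0, \pi/2]$ reproduces $\sin t$ to within the quadrature error.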
Let us now reconsider the Volterra equation, in light of the more powerful technique
of the Neumann series.

Example 5.11 Consider in $C([a, b], \mathbb{R})$ the linear Volterra equation

$$x(t) = \lambda \int_a^t ds\; K(t, s)\,x(s) + f(t) = \lambda Ax + f,$$

with the usual assumption of continuity and boundedness on $K(t, s)$ and $f(t)$, and let

$$M = \max_{t, s \in [a, b]} |K(t, s)|, \quad F = \max_{t \in [a, b]} |f(t)|.$$

Compared with the previous treatment of the Fredholm equation by means of point-wise
fixed-point solutions, here the integral does not extend over the entire interval
$[a, b]$ but only up to $t$, which in turn becomes the integration variable of the next
iteration. Since integration usually makes functions smoother, we expect to obtain
better conditions on the values of $\lambda$ for which the iterative solution converges.
Consider then the solution

$$x(t) = \sum_{i=0}^{\infty} \lambda^i A^i f = \sum_{i=0}^{\infty} \lambda^i \phi_i(t), \qquad \phi_i = A^i f,$$

which certainly exists at least for those values of $\lambda$ found for the linear Fredholm
equation. The following estimates hold

$$|\phi_1(t)| \le \int_a^t ds\; |K(t, s)\,f(s)| \le M F\,(t - a),$$

$$|\phi_2(t)| \le \int_a^t ds\; |K(t, s)\,\phi_1(s)| \le M^2 F \int_a^t ds\; (s - a) = M^2 F\,(t - a)^2/2, \quad \ldots,$$

$$|\phi_n(t)| \le M^n F\,\frac{(t - a)^n}{n!}.$$

It follows that the series $\sum_i \lambda^i \phi_i(t)$ is dominated term by term,

$$\sum_{i=0}^{\infty} |\lambda^i \phi_i(t)| \le \sum_{i=0}^{\infty} |\lambda|^i M^i F\,\frac{(t - a)^i}{i!} = F \sum_{i=0}^{\infty} \frac{\left(|\lambda| M (t - a)\right)^i}{i!} = F \exp\left[\,|\lambda| M (t - a)\,\right],$$

and is therefore convergent, uniformly on $[a, b]$, for all values of $\lambda$. Therefore, the function $x(t)$ is continuous and is a
solution of the Volterra equation for every $\lambda$.
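To see the factorial decay at work, consider the simplest instance of Example 5.11, chosen here purely for illustration: $K \equiv 1$, $f \equiv 1$ and $a = 0$, so that $M = F = 1$, $\phi_i(t) = t^i/i!$ and the series sums to $e^{\lambda t}$. The sketch below builds the iterates exactly as polynomials (lists of coefficients) and sums the series for $\lambda = 5$ on $[0, 1]$, a value for which the Fredholm-type criterion $|\lambda| M (b - a) < 1$ fails. All names are hypothetical.

```python
import math

def integrate_from_zero(coeffs):
    """Antiderivative, vanishing at 0, of the polynomial sum(c_k * t**k)."""
    return [0.0] + [c / (k + 1) for k, c in enumerate(coeffs)]

def evaluate(coeffs, t):
    return sum(c * t ** k for k, c in enumerate(coeffs))

def volterra_partial_sum(lam, terms):
    """Coefficients of sum_{i=0}^{terms} lam^i * phi_i for K = 1, f = 1, a = 0."""
    phi = [1.0]      # phi_0 = f = 1
    total = [1.0]
    for i in range(1, terms + 1):
        phi = integrate_from_zero(phi)          # phi_i = A phi_{i-1}
        scaled = [lam ** i * c for c in phi]
        total = total + [0.0] * (len(scaled) - len(total))
        total = [u + v for u, v in zip(total, scaled)]
    return total

LAM = 5.0
x_coeffs = volterra_partial_sum(LAM, 60)
```

In this special case the bound $|\phi_n(t)| \le M^n F (t - a)^n/n!$ holds with equality, and `evaluate(x_coeffs, t)` agrees with $e^{5t}$ on $[0, 1]$ up to rounding.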
Example 5.12 Let us see how the linear homogeneous Fredholm equation admits
solutions other than the zero solution. Given the equation

$$x(t) = \lambda \int_0^1 ds\; (t + s)\,x(s),$$

we introduce the constants $c_1$ and $c_2$ given by

$$x(t) = \lambda t \int_0^1 ds\; x(s) + \lambda \int_0^1 ds\; s\,x(s) = \lambda t\,c_1 + \lambda c_2.$$

By inserting this form of $x(t)$ into the first equation, one obtains

$$\lambda t\,c_1 + \lambda c_2 = \lambda \int_0^1 ds\; (t + s)(\lambda s\,c_1 + \lambda c_2).$$

The coefficients of similar terms in $t$ must be equal, so that

$$c_1 = \lambda c_1/2 + \lambda c_2, \qquad c_2 = \lambda c_1/3 + \lambda c_2/2.$$

The two equations are compatible for $\lambda = -6 \pm 4\sqrt{3}$ and the system, since it is
homogeneous, admits infinitely many solutions, namely

$$c_1 = \frac{2\lambda}{2 - \lambda}\,c, \qquad c_2 = c,$$

with $c$ an arbitrary parameter. The non-trivial solution of our equation is then given by

$$x(t) = \frac{2\lambda t}{2 - \lambda} + 1$$

up to an arbitrary proportionality factor. Obviously, $|\lambda| = |-6 \pm 4\sqrt{3}| > 1/2$, below
which value the solution is unique.
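The compatibility condition of Example 5.12 is easy to verify by machine. Eliminating $c_1, c_2$ from the two linear equations gives the determinant condition $(1 - \lambda/2)^2 - \lambda^2/3 = 0$, whose roots are $\lambda = -6 \pm 4\sqrt{3}$. The sketch below checks this and confirms that $x(t) = pt + 1$ with $p = 2\lambda/(2 - \lambda)$ is reproduced by the integral operator; the integral of $(t + s)(ps + 1)$ over $[0, 1]$ is evaluated in closed form. The function names are assumptions made for illustration.

```python
import math

def determinant(lam):
    """Determinant of the homogeneous system for (c1, c2)."""
    return (1 - lam / 2) ** 2 - lam ** 2 / 3

def slope(lam):
    """Slope p of the candidate solution x(t) = p*t + 1."""
    return 2 * lam / (2 - lam)

def apply_operator(lam, p):
    """Return (slope, intercept) of lam * integral_0^1 (t + s)(p*s + 1) ds.

    The integral equals t*(p/2 + 1) + (p/3 + 1/2).
    """
    return lam * (p / 2 + 1), lam * (p / 3 + 0.5)

LAMBDAS = [-6 + 4 * math.sqrt(3), -6 - 4 * math.sqrt(3)]
```

For each of the two values of $\lambda$, `apply_operator(lam, slope(lam))` returns the pair `(slope(lam), 1.0)` up to rounding, i.e. the operator maps $x$ back to itself.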

Remark 5.5 In the previous examples the $L_\infty$ norm has always been assumed. This
was done in order to allow for the machinery developed thus far. It should be noted
though that this does not exclude the existence of non-continuous solutions, i.e., ones
that we have excluded a priori.

Example 5.13 Consider the homogeneous Volterra equation

$$x(t) = \int_0^t ds\; s^{t-s}\,x(s), \qquad 0 \le t \le 1.$$

The nucleus $K(t, s) = s^{t-s}$ is continuous in the integration domain with $s \le t$, so
that the equation admits the unique continuous solution $x = 0$. However, there exist
infinitely many solutions not belonging to $C([0, 1], \mathbb{R})$, given by $x(t) = c\,t^{t-1}$ with
$c$ an arbitrary constant, as can be checked directly

$$\int_0^t ds\; s^{t-s}\,x(s) = c\int_0^t ds\; s^{t-s}\,s^{s-1} = c\int_0^t ds\; s^{t-1} = c\,\frac{t^t}{t} = c\,t^{t-1}.$$

As expected, the function $x(t)$ is singular, since $x(t) \to \infty$ as $t \to 0$.

Exercises 

Exercise 5.21 Prove that if the Neumann series of an operator $A : V \to V$, where
$V$ is a normed space, converges, then $A^m \to 0$ in the operator norm.

Exercise 5.22 If $A : V \to W$ is an injective linear operator between normed spaces,
does it follow that $\|A\| > 0$?

Exercise 5.23 Show that linear operators can be as wild as one desires by showing
that given any function $f : X \to Y$, where $X$ is an arbitrary set and $Y$ is a linear
space over $K$, there exists a linear space $\widehat{X}$ and an injection $i : X \to \widehat{X}$ together with
a linear operator $F : \widehat{X} \to Y$ such that $f = F \circ i$.

Exercise 5.24 Consider the space $\mathbb{R}$ with its standard inner product structure and
induced norm. Given any $a \in \mathbb{R}$ show that the function $\psi : \mathbb{R} \to \mathbb{R}$ given by

$$\psi(x) = ax$$

is a bounded linear operator. What is its norm? How many such operators have norm
equal to 1?

Exercise 5.25 Repeat the previous exercise with $\mathbb{C}$ instead of $\mathbb{R}$.

Exercise 5.26 Let $\mathcal{B}$ be a Banach space and $I$ the identity operator on $\mathcal{B}$. Prove
that any linear operator $A : \mathcal{B} \to \mathcal{B}$ satisfying $\|A - I\| < 1$ is invertible.

Exercise 5.27 Let $\mathcal{B}$ be a Banach space and $I$ the identity operator on $\mathcal{B}$. Suppose
$A : \mathcal{B} \to \mathcal{B}$ is a linear operator with $\|A - I\| = 1$. Is $A$ necessarily not invertible?

Exercise 5.28 Let $B : \mathcal{B} \to \mathcal{B}$ be an invertible bounded linear operator on a Banach space
$\mathcal{B}$. Prove that any linear operator $A : \mathcal{B} \to \mathcal{B}$ satisfying $\|A - B\| < 1/\|B^{-1}\|$ is
invertible.

Exercise 5.29 Let $B : \mathcal{B} \to \mathcal{B}$ be an invertible bounded linear operator on a Banach space
$\mathcal{B}$. If $A : \mathcal{B} \to \mathcal{B}$ is a linear operator satisfying $\|A - B\| \ge 1/\|B^{-1}\|$, is $A$ necessarily
not invertible?

Exercise 5.30 Let $A : \mathcal{B} \to \mathcal{B}$ be an invertible linear operator on a Banach space
$\mathcal{B}$. Prove that given $\varepsilon > 0$ there exists a $\delta > 0$ such that $\|B^{-1} - A^{-1}\| < \varepsilon$ for all
linear operators $B : \mathcal{B} \to \mathcal{B}$ satisfying $\|B - A\| < \delta$.


5.4 Dual Spaces 

Let us briefly recall the fundamentals of duality for finite dimensional linear spaces,
where for simplicity we take the ground field to be $\mathbb{R}$. The dual space of a finite
dimensional linear space $V$ is the linear space $V^* = \operatorname{Hom}(V, \mathbb{R})$ of all linear operators
$A : V \to \mathbb{R}$, where $\mathbb{R}$ is viewed as a linear space over itself in the usual way.
Given a basis for $V$, one can associate with it its dual basis, which is then a basis for
$V^*$, in particular showing that $V$ and $V^*$ have the same dimension, and thus they are
isomorphic.

When considering these results in the infinite dimensional setting there are two
immediate issues. The first is in the choice of linear operators $V \to \mathbb{R}$, namely do
we consider all linear operators or only the bounded ones. The second issue is that
results about the dual space that depend on a basis, for most infinite dimensional
linear spaces, are only theoretical (since bases, while they exist, are often impossible
to exhibit explicitly).

With that in mind, we define the dual space in the infinite dimensional context and
we prove three major results. First is the Riesz Representation Theorem which gives
a complete characterization of the dual space of any Hilbert space. Then we compute
some duals related to the $\ell_p$ spaces, and finally we have a look at the Hahn-Banach
Theorem (necessarily only a brief look, since a full treatment can easily fill an entire
chapter).


5.4.1 Linear Functionals and the Riesz Representation Theorem 

We now define linear functionals and the dual space of a general linear space, study 
several examples, and then establish the most general form of a bounded linear 
functional on a Hilbert space. 

Definition 5.10 A linear functional on a normed space $V$ is a linear operator $V \to K$
to the ground field $K$, thus either $K = \mathbb{R}$ or $K = \mathbb{C}$. The set of all bounded linear
functionals $F : V \to K$ is denoted by $V^*$ and is called the dual space of $V$.

Of course, the general theory of linear operators specializes to linear functionals. In
particular,

1. The norm of a linear functional $F \in V^*$ is given by

$$\|F\| = \sup_{\|x\| \le 1} |F(x)|$$

(or any of the other equivalent expressions given in Proposition 5.2).

2. A linear functional is continuous if, and only if, it is bounded.

3. $V^*$ is a Banach space.

Example 5.14 Consider $C([a, b], \mathbb{R})$ with the $L_\infty$ norm, i.e., $\|x\| = \max_{t \in [a, b]} |x(t)|$.
The definite integral

$$F(x) = \int_a^b dt\; x(t)$$

defines a linear functional on this space. Indeed, linearity follows by fundamental properties
of integration, and as for the norm of $F$, the inequality

$$|F(x)| \le \int_a^b dt\; |x(t)| \le (b - a) \max_{a \le t \le b} |x(t)| = (b - a)\,\|x\|$$

shows that $\|F\| \le b - a$. In fact, consideration of the constant function $x(t) = 1$
shows that $\|F\| = b - a$.

Example 5.15 The previous example easily generalizes as follows. Still considering
$C([a, b], \mathbb{R})$ with the $L_\infty$ norm, suppose a function $y(t)$ is given which
is integrable over $[a, b]$. Then, integrating against $y$ gives rise to the function
$F_y : C([a, b], \mathbb{R}) \to \mathbb{R}$ given by

$$F_y(x) = \int_a^b dt\; y(t)\,x(t)$$

(the previous example is the case $y(t) = 1$). The verification that $F_y$ is linear follows
immediately and the computation showing that

$$\|F_y\| = \int_a^b dt\; |y(t)|$$

is left for the reader.

For a real Hilbert space $\mathcal{H}$ it is a non-trivial fact that the dual space $\mathcal{H}^*$ and the
original space $\mathcal{H}$ are essentially the same, namely $\mathcal{H}$ is self-dual. Before we can
establish this result we need to first look at linear functionals on inner product spaces
and study perpendicularity in Hilbert spaces, an important and non-trivial issue.

Proposition 5.4 Let $V$ be an inner product space over the field $K$ and let $y \in V$.
The function $F_y : V \to K$ given by $F_y(x) = \langle y, x \rangle$ is a bounded linear functional
with $\|F_y\| = \|y\|$. In particular, $F_y \in V^*$.

Proof The linearity of $F_y$ is immediate from the definition of an inner product space.
Using the Cauchy-Schwarz Inequality, it follows that

$$|F_y(x)| = |\langle y, x \rangle| \le \|y\|\,\|x\|$$

which thus shows that $\|F_y\| \le \|y\|$. Considering the case where $x = y$ shows that
$\|F_y\| = \|y\|$. □

A linear functional of the form $F_y$ is said to be representable, with $y$ representing it.
For every inner product space $V$, we thus obtain a natural candidate for a comparison
between $V$ and its dual $V^*$, namely the function $\varphi : V \to V^*$ given by $\varphi(y) = F_y$,
i.e., mapping every element $y \in V$ to the bounded linear functional it represents.
This function is (easily seen to be) an injection, and so $V$ always embeds in its dual
space. However, $\varphi$ need not be surjective, i.e., not every bounded linear functional
on $V$ need be representable. In this and in similar situations, results establishing
conditions under which $\varphi$ is surjective are called representability theorems. The
one we present below, namely that if $V$ is a Hilbert space, then $\varphi$ is surjective, is one
of many famous representability theorems due to Riesz. We first need to establish
some general geometric results in Hilbert spaces.

Theorem 5.15 Any non-empty subset $S \subseteq \mathcal{H}$ of a Hilbert space $\mathcal{H}$ which is closed
and satisfies $(x + y)/2 \in S$ for all $x, y \in S$, contains an element of smallest norm.

Proof Consider the real number $\delta = \inf\{\|x\| \mid x \in S\}$, which exists since $S \ne \emptyset$. We
need to show that the infimum is attained. There certainly exists a sequence $\{x_k\}_{k \ge 1}$
in $S$ with $\|x_k\| \to \delta$. It suffices to show that $x_k \to x_0$ for some $x_0 \in \mathcal{H}$. Indeed,
the continuity of the norm would then imply that $\|x_0\| = \delta$ and since $S$ is closed, the
limit point $x_0$ must itself be in $S$. Since $\mathcal{H}$ is complete the convergence of $\{x_k\}_{k \ge 1}$
would follow if we can show that the sequence is Cauchy. To that end, recall the
parallelogram law

$$\|x + y\|^2 + \|x - y\|^2 = 2\|x\|^2 + 2\|y\|^2$$

which holds in any inner product space. Let now $x, y \in S$ and apply the parallelogram
law to $x/2$ and $y/2$, to obtain

$$\frac{1}{4}\|x - y\|^2 = \frac{1}{2}\|x\|^2 + \frac{1}{2}\|y\|^2 - \left\|\frac{x + y}{2}\right\|^2.$$

Since $(x + y)/2 \in S$ by assumption, so that $\|(x + y)/2\| \ge \delta$, it follows that

$$\|x - y\|^2 \le 2\|x\|^2 + 2\|y\|^2 - 4\delta^2.$$

Applying this inequality to the elements of the sequence $\{x_k\}_{k \ge 1}$, for which the
right-hand side tends to $2\delta^2 + 2\delta^2 - 4\delta^2 = 0$, shows that it is
indeed a Cauchy sequence, as required. □

For the next result recall that in an inner product space $V$ the notation $x \perp y$ stands
for $\langle x, y \rangle = 0$, and $x \perp S$, for a subset $S \subseteq V$, stands for $\langle x, y \rangle = 0$ for all $y \in S$.
The geometric interpretation of $x \perp y$ is that $x$ and $y$ are perpendicular.

Theorem 5.16 Let $\mathcal{H}$ be a Hilbert space and $S \subset \mathcal{H}$ a proper closed linear
subspace of $\mathcal{H}$. There exists then an element $z \ne 0$ such that $z \perp S$.

Proof Since $S$ is a proper subset of $\mathcal{H}$, let $x \in \mathcal{H} - S$. Consider the set

$$x + S = \{x + y \mid y \in S\}$$

and notice that it satisfies the conditions of Theorem 5.15. Thus let $z$ be an element
of minimal norm in the set $x + S$. Note that $x \notin S$ implies $z \ne 0$, and we claim that
$z \perp S$. Indeed, let $s \in S$ be arbitrary, where we may assume $\|s\| = 1$. Noting that
$z - \alpha s \in x + S$ for all $\alpha$ in the ground field (either $\mathbb{R}$ or $\mathbb{C}$), the minimality of the
norm of $z$ implies that

$$\|z\|^2 \le \|z - \alpha s\|^2 = \langle z - \alpha s, z - \alpha s \rangle = \|z\|^2 + |\alpha|^2\|s\|^2 - \alpha\langle z, s \rangle - \bar{\alpha}\langle s, z \rangle$$

which simplifies to

$$0 \le |\alpha|^2 - \alpha\langle z, s \rangle - \bar{\alpha}\langle s, z \rangle.$$

Taking $\alpha = \langle s, z \rangle$ leads to

$$0 \le |\alpha|^2 - \alpha\bar{\alpha} - \bar{\alpha}\alpha = -|\alpha|^2 = -|\langle s, z \rangle|^2$$

and the result follows. □

We can now prove the Riesz representation theorem. 

Theorem 5.17 (Riesz Representation Theorem For Hilbert Spaces) For a Hilbert
space $\mathcal{H}$ over the field $K$ the function $\varphi : \mathcal{H} \to \mathcal{H}^*$ given by $\varphi(y) = F_y$, where
$F_y(x) = \langle y, x \rangle$, is an isometric (anti) isomorphism. In more detail:

1. $\varphi$ is a bijection.

2. $\varphi$ is norm preserving, i.e., $\|\varphi(y)\| = \|y\|$ for all $y \in \mathcal{H}$.

3. $\varphi$ is additive, i.e., $\varphi(y + y') = \varphi(y) + \varphi(y')$ for all $y, y' \in \mathcal{H}$.

4. If $K = \mathbb{R}$, then $\varphi(\alpha y) = \alpha\varphi(y)$.

5. If $K = \mathbb{C}$, then $\varphi(\alpha y) = \bar{\alpha}\varphi(y)$.

Proof The validity of the last three claims is immediate from basic properties of
the inner product. The fact that $\varphi$ is norm preserving was already established in
Proposition 5.4, and in particular $\varphi$ is injective. It thus only remains to show that $\varphi$ is
surjective, in other words that every element in the dual space $\mathcal{H}^*$ is of the form $F_y$
for some $y \in \mathcal{H}$ (i.e., is representable). This is the only part of the theorem where
the completeness of $\mathcal{H}$ plays a role.

Let $F \in \mathcal{H}^*$ be an arbitrary bounded linear functional on $\mathcal{H}$. If $F = 0$ the
claim is evident so we may proceed under the assumption that $F \ne 0$. Consider
the kernel $S = \ker(F) = \{x \in \mathcal{H} \mid Fx = 0\} = F^{-1}(\{0\})$. Then $S$ is a linear
subspace of $\mathcal{H}$ and in fact a proper one by the assumption that $F \ne 0$. Further,
since $F$ is continuous, $S$ is also closed (as the inverse image of the closed set $\{0\}$).
By Theorem 5.16 there exists a non-zero $y \in \mathcal{H}$ such that $y \perp S$. In particular,
$y \notin S$, so $F(y) \ne 0$; let $z = y/F(y)$, and note that $F(z) = 1$. Now, for all $x \in \mathcal{H}$

$$F(x - F(x)z) = F(x) - F(x)F(z) = 0$$

and thus $x - F(x)z \in S$. Since $z$ is perpendicular to $S$ it follows that

$$0 = \langle z, x - F(x)z \rangle = \langle z, x \rangle - F(x)\,\|z\|^2.$$

Let $h = z/\|z\|^2$ and then it follows that

$$F(x) = \langle h, x \rangle,$$

as required. □
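In finite dimensions the representing vector can be written down directly, which gives a convenient way to test the formulas of the proof. The sketch below works in $\mathbb{R}^3$ with the standard inner product, uses the finite-dimensional shortcut $h_k = F(e_k)$ (an assumption of this illustration, not the route taken in the proof), and then checks that the proof's recipe $z = y/F(y)$, $h = z/\|z\|^2$ with $y = h$ returns the same vector. The functional `F` is a hypothetical example.

```python
import math

def inner(x, y):
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    return math.sqrt(inner(x, x))

def representing_vector(F, n):
    """Finite-dimensional shortcut: the k-th coordinate of h is F(e_k)."""
    basis = [[1.0 if i == k else 0.0 for i in range(n)] for k in range(n)]
    return [F(e) for e in basis]

def F(x):
    """A sample bounded linear functional on R^3 (chosen for illustration)."""
    return 2.0 * x[0] - 1.0 * x[1] + 0.5 * x[2]

h = representing_vector(F, 3)

# Follow the proof with y = h, which is perpendicular to ker(F):
z = [c / F(h) for c in h]                      # z = y / F(y), so F(z) = 1
h_from_proof = [c / inner(z, z) for c in z]    # h = z / ||z||^2

# The norm of F is attained at the unit vector h / ||h||.
unit = [c / norm(h) for c in h]
```

Here `F(x)` equals `inner(h, x)` for every `x`, and `F(unit)` equals `norm(h)`, matching $\|F_h\| = \|h\|$ from Proposition 5.4.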


5.4.2 Duals of Classical Spaces 

For a real Hilbert space $\mathcal{H}$ the Riesz Representation Theorem establishes that $\mathcal{H}$ is
isomorphic to its dual $\mathcal{H}^*$, and moreover the isomorphism preserves the norm. This
prompts the following definition.

Definition 5.11 A function $A : V \to W$ between two normed spaces is a linear
isometry if $A$ is a linear isomorphism which respects the norms in the sense that
$\|Ax\| = \|x\|$ for all $x \in V$. Normed spaces $V$ and $W$ are said to be congruent if
there exists a linear isometry between them.

One can easily verify that a function $A : V \to W$ is a linear isometry if, and only
if, it is a linear isomorphism which is also an isometry with respect to the induced
metrics on $V$ and $W$.

The following theorem establishes classical congruences for the family of $\ell_p$
spaces.

Theorem 5.18 (Riesz) For the claims below all linear spaces are of real sequences,
and thus are linear spaces over the field $\mathbb{R}$.

1. $c_0^* \cong \ell_1$.

2. $\ell_1^* \cong \ell_\infty$.

3. $\ell_p^* \cong \ell_q$ for all $1 < p, q < \infty$ such that $1/p + 1/q = 1$.

Proof

1. Firstly, we construct a function $A : c_0^* \to \ell_1$, for which we use the standard
unit vectors $e_k = (0, \ldots, 0, 1, 0, \ldots) \in c_0$, with 1 in the $k$-th position. Notice
that $\{e_k\}_{k \ge 1}$ is not a basis for $c_0$. Given a linear functional $F \in c_0^*$ let $AF$ be the
sequence $(Fe_1, Fe_2, \ldots, Fe_k, \ldots)$, and denote $a_k = Fe_k$. Firstly, we claim that
$AF \in \ell_1$, so that the codomain of $A$ is indeed $\ell_1$. To show that $AF \in \ell_1$ we need
to establish that

$$\sum_{k=1}^{\infty} |a_k| < \infty.$$

For any $a \in \mathbb{R}$, let $\sigma(a)$ denote the sign of $a$, i.e., $\sigma(a) = a/|a|$ (with $\sigma(0) = 0$)
and consider

$$x^{(m)} = (\sigma(a_1), \ldots, \sigma(a_m), 0, 0, 0, \ldots).$$

Since $|F(x^{(m)})| \le \|F\|\,\|x^{(m)}\| \le \|F\| < \infty$, and noting that

$$F(x^{(m)}) = \sum_{k=1}^{m} |a_k|,$$

it follows that

$$\sum_{k=1}^{m} |a_k| \le \|F\| < \infty$$

and the same inequality remains valid when passing to the limit as $m \to \infty$.
Having defined $A : c_0^* \to \ell_1$, the verification that $A$ is linear is immediate, and
thus it remains to show that $A$ is surjective and that $\|AF\|_1 = \|F\|$. Indeed, for
surjectivity let $a \in \ell_1$, i.e.,

$$\sum_{k=1}^{\infty} |a_k| < \infty.$$

If we now define $H : c_0 \to \mathbb{R}$ by means of the formula

$$Hx = \sum_{k=1}^{\infty} a_k x_k,$$

then, formally,

$$He_k = a_k$$

and thus

$$AH = a.$$

Thus, it remains to show that $Hx$ is well-defined. Indeed,

$$\sum_{k=1}^{\infty} |a_k x_k| \le \sup_k |x_k| \sum_{k=1}^{\infty} |a_k| = \|x\|_\infty\,\|a\|_1$$

and so the series defining $Hx$ is absolutely convergent, and thus convergent. The
linearity of $H$ is obvious and the fact that it is bounded follows from the estimate
above. It thus follows that indeed $H \in c_0^*$, and, as noted, $AH = a$, so that $A$ is
surjective. Combining the estimates above it also follows that $\|F\| = \|AF\|_1$, so
$A$ is an isometry.

2. The details of the proof are similar to the proof of the first claim, so we leave this 
proof to the reader. 

3. The details are again similar to those given above, but this time some adaptation is
required in the construction, with some complicating consequences in the needed
estimates. Given $F \in \ell_p^*$ let $AF = (Fe_1, \ldots, Fe_k, \ldots)$ where $\{e_k\}_{k \ge 1}$ are again the
standard unit vectors, and denote again $a_k = Fe_k$. Considering

$$x^{(m)} = (|a_1|^{q-1}\sigma(a_1), \ldots, |a_m|^{q-1}\sigma(a_m), 0, 0, 0, \ldots)$$

we obtain

$$\sum_{k=1}^{m} |a_k|^q = F(x^{(m)}) \le \|F\|\,\|x^{(m)}\|_p = \|F\| \left(\sum_{k=1}^{m} |a_k|^{(q-1)p}\right)^{1/p} = \|F\| \left(\sum_{k=1}^{m} |a_k|^q\right)^{1/p}.$$

In other words

$$\left(\sum_{k=1}^{m} |a_k|^q\right)^{1/q} \le \|F\|$$

and thus $\|a\|_q \le \|F\| < \infty$. This part of the proof then guarantees that
$A : \ell_p^* \to \ell_q$ has the claimed codomain. The rest of the proof proceeds as the
one above, namely defining for a given $a \in \ell_q$ the function $H : \ell_p \to \mathbb{R}$ by
means of the formula

$$H(x) = \sum_{k=1}^{\infty} a_k x_k,$$

where Hölder’s Inequality is used for the relevant estimates. □
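The choice of the test vector with entries $|a_k|^{q-1}\sigma(a_k)$ in part 3 is exactly the case of equality in Hölder's Inequality. A finite sketch, with an arbitrarily chosen vector $a$ and exponent $p = 3$ (both assumptions made for illustration), confirms that the pairing $\sum_k a_k x_k$ then equals $\|a\|_q\,\|x\|_p$, while a generic vector only satisfies the inequality.

```python
def p_norm(x, p):
    """The l_p norm of a finite sequence."""
    return sum(abs(c) ** p for c in x) ** (1.0 / p)

def sign(a):
    """Sign of a real number, with sign(0) = 0."""
    return (a > 0) - (a < 0)

def extremal_vector(a, q):
    """Vector with entries |a_k|^(q-1) * sign(a_k), as in the proof."""
    return [abs(c) ** (q - 1) * sign(c) for c in a]

def pairing(a, x):
    """Value of the functional H(x) = sum a_k x_k."""
    return sum(u * v for u, v in zip(a, x))

P = 3.0
Q = P / (P - 1)                 # conjugate exponent, 1/P + 1/Q = 1
a = [1.0, -2.0, 0.5, 0.0]       # hypothetical element of l_q (finitely supported)
x = extremal_vector(a, Q)
y = [1.0, 1.0, 1.0, 1.0]        # a generic comparison vector
```

Equality for `x` shows that the operator norm of $H$ is at least $\|a\|_q$, and Hölder's Inequality gives the reverse bound.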

Remark 5.6 Similar congruences hold true for $L_p$ spaces. However, the proofs rely
on non-trivial measure theoretic results, and thus lie beyond the scope of this book.

5.4.3 The Hahn-Banach Theorem

The main result we now present is a general result about normed spaces, valid not 
only for Banach spaces. We will present some of its consequences that further relate 
a space and its dual. 

Theorem 5.19 (Hahn-Banach) Let $V$ be a normed space over $K$, with either $K = \mathbb{R}$
or $K = \mathbb{C}$. Suppose that $U \subseteq V$ is a linear subspace, and $f : U \to K$ a bounded linear
functional on the subspace $U$. There exists then a linear functional $F : V \to K$ such
that $F|_U = f$ and such that $\|F\| = \|f\|$.

Proof For simplicity, we will only give the proof for real normed spaces. Thus we
assume that a bounded linear functional $f : U \to \mathbb{R}$ is given. Clearly, if $f = 0$,
then $F = 0$ is the desired extension to all of $V$. We may thus proceed under the
assumption that $f \ne 0$, and thus, normalizing if needed, that $\|f\| = 1$. Suppose that
$x_0 \in V - U$ and let $U_1$ be the span of $U \cup \{x_0\}$. It is easy to see that $U_1$ consists
of all vectors of the form $x + \beta x_0$, where $x \in U$ and $\beta \in \mathbb{R}$. Notice that, given any
$\alpha \in \mathbb{R}$, the function

$$f_1(x + \beta x_0) = f(x) + \beta\alpha$$

is a linear functional on $U_1$ and it extends $f$. We will now show that an $\alpha \in \mathbb{R}$ can be
chosen so as to assure that $\|f_1\| = 1$. By the definition of the norm $\|f_1\|$, it suffices
to guarantee that

$$|f(x) + \beta\alpha| \le \|x + \beta x_0\|,$$

for all $x \in U$ and $\beta \in \mathbb{R}$. Stated differently (replace $x$ by $-\beta x$ and divide both sides
by $|\beta|$), we seek an $\alpha$ such that

$$|f(x) - \alpha| \le \|x - x_0\|,$$

for all $x \in U$. We thus wish to find an $\alpha \in \mathbb{R}$ with

$$f(x) - \|x - x_0\| \le \alpha \le f(x) + \|x - x_0\|,$$

for all $x \in U$. In other words, $\alpha$ is common to all the intervals $[A_x, B_x]$, for all
$x \in U$, where $A_x = f(x) - \|x - x_0\|$ and $B_x = f(x) + \|x - x_0\|$. The existence of
such an $\alpha$ will follow by showing that $A_x \le B_y$ for all $x, y \in U$. And indeed,

$$B_y - A_x = f(y) - f(x) + \|y - x_0\| + \|x - x_0\|$$

and since

$$|f(y) - f(x)| = |f(y - x)| \le \|y - x\| \le \|y - x_0\| + \|x - x_0\|$$

it follows that

$$B_y - A_x \ge 0.$$

We thus established that $f$, and, importantly, that in fact any bounded linear functional defined
on a proper subspace, can be extended to the span of its current domain with one
additional vector added, without altering the norm.

The stage is now clear for an application of Zorn’s Lemma to further extend $f_1$,
one dimension at a time, until we arrive at a linear functional whose domain is the
entire space $V$. Let $P$ be the collection of all pairs $(U', f')$ where $U' \supseteq U$ is a linear
subspace of $V$ and $f' : U' \to \mathbb{R}$ is a linear functional extending $f$ and satisfying
$\|f'\| = 1$. Introduce a partial order on $P$ by means of

$$(U', f') \le (U'', f'')$$

precisely when $U' \subseteq U''$ and when $f''$ extends $f'$. The fact that $P$ is then a poset
is immediate. We must now establish the conditions of Zorn’s Lemma. Firstly, $P$ is
non-empty since $(U, f)$ is a member of $P$. Next, given a chain $\{(U_i, f_i)\}_{i \in I}$ in $P$,
consider $U' = \bigcup_{i \in I} U_i$ and define $f' : U' \to \mathbb{R}$ as follows. Given $u \in U'$ there
exists an index $i \in I$ with $u \in U_i$. If $j \in I$ is another index with $u \in U_j$, then,
since we are given a chain, either $f_i$ extends $f_j$ or $f_j$ extends $f_i$. In either case
setting $f'(u) = f_i(u) = f_j(u)$ produces a well-defined function. The function $f'$
is a linear functional. Indeed, if $u_1, u_2 \in U'$, then $u_1 \in U_{i_1}$ and $u_2 \in U_{i_2}$ for some
$i_1, i_2 \in I$. But, by the chain condition, we may assume without loss of generality
that $U_{i_1} \subseteq U_{i_2}$. Thus

$$f'(u_1 + u_2) = f_{i_2}(u_1 + u_2) = f_{i_2}(u_1) + f_{i_2}(u_2) = f'(u_1) + f'(u_2)$$

and thus $f'$ is additive. A similar argument shows that $f'$ is also homogeneous. We
leave it to the reader to similarly show that $\|f'\| = 1$ and thus it is now clear that
$(U', f')$ is in $P$ and is an upper bound for the given chain. Note however that there
is no guarantee that $f'$ is the desired linear functional $F$, since $U'$ may be properly
contained in $V$.

With the conditions of Zorn’s Lemma now verified, the existence of a maximal
element $(U_M, F) \in P$ is guaranteed. If $U_M = V$, then $F$ is a linear functional on all
of $V$, it extends $f$, and its norm is 1, as required. Assume thus that $U_M$ is properly
contained in $V$, and let $x_0 \in V - U_M$. But, at the beginning of the proof we saw that in
such a situation $F$ can be extended to the linear subspace $U_1$, the span of $U_M \cup \{x_0\}$,
without altering the norm. That would give rise to an element $(U_1, f_1) \in P$ with
$(U_M, F) < (U_1, f_1)$, an impossibility. The proof is thus complete. □
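The one-dimensional extension step of the proof can be made concrete. In the sketch below, $V = \mathbb{R}^2$, $U = \{(t, 0)\}$, $f(t, 0) = t$ and $x_0 = (0, 1)$, all chosen purely for illustration. The admissible values of $\alpha$ form the interval $[\sup_x A_x, \inf_x B_x]$, which is approximated here by sampling $x \in U$ on a finite grid; the sampled interval can only contain the true one, although for this example the two coincide.

```python
def sub(x, y):
    return [a - b for a, b in zip(x, y)]

def max_norm(x):
    return max(abs(c) for c in x)

def l1_norm(x):
    return sum(abs(c) for c in x)

def f(x):
    """Functional on U = {(t, 0)}: f(t, 0) = t, of norm 1 for both norms above."""
    return x[0]

def extension_interval(f, norm, x0, samples):
    """Approximate [sup A_x, inf B_x] over the sampled vectors x of U."""
    lower = max(f(x) - norm(sub(x, x0)) for x in samples)   # sampled sup of A_x
    upper = min(f(x) + norm(sub(x, x0)) for x in samples)   # sampled inf of B_x
    return lower, upper

X0 = [0.0, 1.0]
SAMPLES = [[k / 10.0, 0.0] for k in range(-50, 51)]

interval_max = extension_interval(f, max_norm, X0, SAMPLES)
interval_l1 = extension_interval(f, l1_norm, X0, SAMPLES)
```

With the max norm the interval collapses to the single point $\alpha = 0$, so the norm-preserving extension is unique, whereas with the $\ell_1$ norm every $\alpha \in [-1, 1]$ works. This illustrates that the Hahn-Banach extension need not be unique.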

There are numerous consequences of the Hahn-Banach Theorem, of which we only
explore a few.

Theorem 5.20 Given a normed space $V$ and $x_0 \in V$ with $x_0 \ne 0$, there exists a
linear functional $F \in V^*$ such that $\|F\| = 1$ and $F(x_0) = \|x_0\|$.

Proof Let

$$V_0 = \{\alpha x_0 \mid \alpha \in \mathbb{C}\}$$

be the subspace of $V$ spanned by $x_0$ and let $F_0 : V_0 \to \mathbb{C}$ be given by

$$F_0(\alpha x_0) = \alpha\|x_0\|.$$

Clearly, $F_0$ is a linear functional, $\|F_0\| = 1$, and $F_0(x_0) = \|x_0\|$. By the Hahn-Banach
Theorem, an extension $F \in V^*$ exists, establishing the claim. □

Corollary 5.5 For all $x \in V$

$$\|x\| = \sup_{F \in V^*,\ \|F\| = 1} |F(x)|.$$

Corollary 5.6 For all vectors $x$ in a normed space $V$, if $F(x) = 0$ for all $F \in V^*$,
then $x = 0$.

Theorem 5.21 Let $V$ be a normed space and $U$ a linear subspace of it. Given $x_0 \notin U$,
it holds that $x_0$ is in the closure of $U$ if, and only if, there exists no bounded linear
functional $f$ on $V$ with the property that $f(x) = 0$ for all $x \in U$ but $f(x_0) \neq 0$.

Proof Suppose that $x_0$ is in the closure of $U$. There exists then a sequence $\{u_n\}_{n \geq 1}$
of elements in $U$ converging to $x_0$. Remembering that a bounded linear functional
$f : V \to K$ is in particular continuous, it follows that $f(u_n) \to f(x_0)$. But then, if
$f(x) = 0$ for all $x \in U$, it follows that $f(x_0) = 0$. In the other direction, suppose $x_0$
is not in the closure of $U$. Then there exists a positive $\delta$ such that $\|x - x_0\| > \delta$, for
all $x \in U$. Consider the span $U'$ of the set $U \cup \{x_0\}$. Define now $f_1(x + t x_0) = t$
(remember that the elements of the span are of the form $x + t x_0$, with $t \in K$ and
$x \in U$). This is clearly a linear functional on $U'$, and since, for $t \neq 0$,

$$\delta |t| < |t| \left\| x_0 + \frac{x}{t} \right\| = \| t x_0 + x \|$$

we have that $\|f_1\| \leq 1/\delta$. Since, by definition, $f_1(x) = 0$ for all $x \in U$ and
$f_1(x_0) = 1$, applying the Hahn-Banach Theorem to extend $f_1$ to all of $V$ establishes
the claim. □

Exercises 

Exercise 5.31 Let V be a normed space. Prove that V* is bounded if, and only if, 
dim(V) = 0. 

Exercise 5.32 Apply the Riesz Representation Theorem to $\mathbb{R}$ and $\mathbb{C}$, each with its
standard inner product, and give an explicit construction of the function $f$, witnessing
its stated properties.

Exercise 5.33 Let $V$ be a normed space and $f \in V^*$. Prove that $p_f : V \to \mathbb{R}$ given
by $p_f(x) = |f(x)|$ is a semi-norm on $V$. Under which conditions on $V$ can it be a
norm?
Exercise 5.34 Complete the proof of the fact that, as real spaces, $\ell_1^* \cong \ell_\infty$.

Exercise 5.35 Does V* = V hold for general normed spaces? 

Exercise 5.36 Let $\{x_1, \ldots, x_n\}$ be $n$ linearly independent vectors in a normed space
$V$, and $z_1, \ldots, z_n \in \mathbb{C}$. Prove the existence of a linear functional $F \in V^*$ with

$$F(x_k) = z_k$$

for all $1 \leq k \leq n$.

Exercise 5.37 Consider the space $C([a, b], \mathbb{R})$ with the $L_\infty$ norm. Given an integrable
function $y : [a, b] \to \mathbb{R}$, prove that the function $F_y : C([a, b], \mathbb{R}) \to \mathbb{R}$
given by

$$F_y(x) = \int_a^b dt\, y(t) x(t)$$

is a linear functional and show that

$$\|F_y\| = \int_a^b dt\, |y(t)|.$$

Exercise 5.38 Show that the extension guaranteed by the Hahn-Banach Theorem 
is, generally speaking, far from unique by presenting a linear operator with infinitely 
many extensions. Proceed in two ways, one by analyzing the proof of the Hahn- 
Banach Theorem, the other by giving explicit extensions of a well-chosen linear 
operator (hint: consider finite dimensional spaces). 

Exercise 5.39 In the previous exercise you constructed explicit extensions of a linear 
functional. Convince yourself that for infinite dimensional linear spaces it may be 
very hard, if at all possible, to obtain an explicit extension of an arbitrary functional. 

Exercise 5.40 Use the Hahn-Banach Theorem to prove that it is possible to assign
to every bounded sequence $\{x_m\}_{m \geq 1}$ of real numbers a real number $\lim_{m \to \infty} x_m$ such
that

1. If $\{x_m\}$ is a convergent sequence, then $\lim_{m \to \infty} x_m$ is the limit of the sequence in
the usual sense.

2. $\lim_{m \to \infty} (x_m + y_m) = \lim_{m \to \infty} x_m + \lim_{m \to \infty} y_m$ for all bounded sequences
$\{x_m\}_{m \geq 1}$ and $\{y_m\}_{m \geq 1}$.

3. $\lim_{m \to \infty} \alpha x_m = \alpha \lim_{m \to \infty} x_m$ for all bounded sequences $\{x_m\}_{m \geq 1}$ and all
scalars $\alpha \in \mathbb{R}$.


Can you determine $\lim_{m \to \infty} (-1)^m$?
5.5 Unbounded Operators and Locally Convex Spaces

The theory that was presented so far is primarily concerned with the theory of bounded 
linear operators between normed spaces. In this section we present two directions for 
generalizations of the theory. The first one stems from the observation made above 
to the effect that while bounded operators encompass a lot of examples of interest, 
some important operators, the differentiation operator for instance, are not bounded. 
One is thus compelled to consider unbounded operators. The second generalization 
we consider originates from the fact that at times the space one is considering does 
not support a norm. It turns out that if instead of a norm one has a suitable family of 
semi-norms, then one can still recover much of the general theory. Spaces with such 
a family of semi-norms are called locally convex spaces, and we will only investigate 
the definition and some illustrative examples, without delving deeper into the general 
theory. 


5.5.1 Closed Operators 

The development of the theory of unbounded linear operators was motivated by the 
unbounded nature of the differentiation operator, as well as the development of a 
mathematical framework for Quantum Mechanics (e.g., to account for unbounded 
observables). Arbitrary unbounded operators may be too wild to tame, and so some 
restrictions must be placed in order to identify a suitable class of operators for which 
a reasonable theory emerges. The closed operators, presented below, form a class 
of operators broader than the bounded ones admitting a strong general theory that 
allows one to analyze situations that lie out of the reach of the theory of bounded
operators. The results we present below are only the tip of the unbounded iceberg. 

In the context of unbounded linear operators it is common to re-interpret the 
definition of an operator between linear spaces to be one that is not necessarily 
defined on all of the specified domain. 

Definition 5.12 Let $\mathcal{B}_1$ and $\mathcal{B}_2$ be Banach spaces. An (unbounded) linear operator
$A : \mathcal{B}_1 \to \mathcal{B}_2$ consists of a linear subspace $\mathcal{D}(A) \subseteq \mathcal{B}_1$ and a function $A : \mathcal{D}(A) \to \mathcal{B}_2$
which is additive and homogeneous. Further, $A$ is a closed linear operator when the
following condition holds. For every sequence $\{x_m\}_{m \geq 1}$ in $\mathcal{D}(A)$, if $x_m \to x_0$ for some
$x_0 \in \mathcal{B}_1$ and $A x_m \to y_0$ for some $y_0 \in \mathcal{B}_2$, then $x_0 \in \mathcal{D}(A)$ and $A x_0 = y_0$.

Example 5.16 In Example 5.4 we already saw that the differentiation operator $d/dt :
C^1([0, 1], \mathbb{R}) \to C([0, 1], \mathbb{R})$, with the $L_\infty$ norm, is an unbounded linear operator.
In the context of unbounded operators we may thus consider

$$d/dt : C([0, 1], \mathbb{R}) \to C([0, 1], \mathbb{R}),$$

with the $L_\infty$ norm, where we may specify various different subspaces as the domain
of definition. Different choices for the actual domain may lead to completely different
properties of the operator. For instance, if we take the domain to be $C^1([0, 1], \mathbb{R})$,
then we obtain an unbounded closed operator. Indeed, this claim follows from the
elementary result that if $\{x_m\}_{m \geq 1}$ is a sequence of functions that converge (point-wise)
to $x_0$, and their derivatives $x'_m$ converge uniformly to $y_0$, then $x_0$ is differentiable
and $x'_0 = y_0$. In contrast, taking the domain of $d/dt$ to be $C^\infty([0, 1], \mathbb{R})$ yields a
non-closed operator.
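The unboundedness of $d/dt$ with respect to the $L_\infty$ norm can be observed numerically. The sketch below uses the functions $x_n(t) = t^n$ on $[0, 1]$, chosen here purely for illustration (the text's Example 5.4 may use a different family), and approximates sup norms on a finite grid. The helper name `sup_norm` is hypothetical.

```python
def sup_norm(f, grid):
    """Approximate the sup norm of f on [0, 1] by its maximum over a finite grid."""
    return max(abs(f(t)) for t in grid)

grid = [k / 1000 for k in range(1001)]  # includes both endpoints 0 and 1

ratios = []
for n in (1, 5, 25, 125):
    x_n = lambda t, n=n: t ** n             # x_n(t) = t^n has sup norm 1
    dx_n = lambda t, n=n: n * t ** (n - 1)  # its derivative has sup norm n
    ratios.append(sup_norm(dx_n, grid) / sup_norm(x_n, grid))
```

The ratios $\|x_n'\| / \|x_n\|$ grow with $n$, so no single constant can bound the operator.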

The defining condition of a closed operator $A : \mathcal{B}_1 \to \mathcal{B}_2$ can be restated in terms
of its graph $\Gamma(A) = \{(x, Ax) \in \mathcal{B}_1 \times \mathcal{B}_2 \mid x \in \mathcal{D}(A)\}$, as follows.

Proposition 5.5 An unbounded linear operator $A : \mathcal{B}_1 \to \mathcal{B}_2$ between Banach
spaces is closed if, and only if, the graph $\Gamma(A)$ is closed in $\mathcal{B}_1 \times \mathcal{B}_2$.

The proof is left for the reader. We now prove that a closed linear operator is auto- 
matically bounded, provided its domain is closed. 

Theorem 5.22 (Banach's Closed Graph Theorem) Let $A : \mathcal{B}_1 \to \mathcal{B}_2$ be a closed
linear operator between Banach spaces. If the domain $\mathcal{D}(A)$ is closed in $\mathcal{B}_1$, then
$A$ is bounded.

Proof First, it is straightforward to verify that the space $\mathcal{B}_1 \times \mathcal{B}_2$ with norm given
by

$$\|(x, y)\| = \|x\| + \|y\|$$

is a Banach space. Since $A$ is closed, its graph $\Gamma(A) \subseteq \mathcal{B}_1 \times \mathcal{B}_2$ is thus a closed
subset of a complete space, and thus is itself a Banach space. Similarly, by assumption,
$\mathcal{D}(A)$ is closed in $\mathcal{B}_1$ and thus is a Banach space too.

We now consider the operator $P : \Gamma(A) \to \mathcal{D}(A)$ given by

$$P(x, Ax) = x,$$

clearly a linear operator. Since

$$\|P(x, Ax)\| = \|x\| \leq \|x\| + \|Ax\| = \|(x, Ax)\|$$

we see that $P$ is a bounded linear operator. Moreover, $P$ is invertible (its inverse $P^{-1}$
obviously given by $x \mapsto (x, Ax)$) and thus, by Corollary 5.2, $P^{-1}$ is bounded, say
by $M$. Therefore,

$$\|P^{-1} x\| = \|x\| + \|Ax\| \leq M \|x\|$$

for all $x \in \mathcal{D}(A)$, and thus

$$\|Ax\| \leq (M - 1)\|x\|$$

showing $A$ is bounded. □

We note that this is yet another deep result in the theory of Banach spaces, seeing 
that this rather short proof uses a corollary of the Open Mapping Theorem. 
5.5.2 Locally Convex Spaces

As motivation for the concept we now present, consider the set $\mathbb{R}^\infty$ of all sequences
of real numbers (with obvious modifications one may also consider $\mathbb{C}^\infty$). Of course
the linear structure of $\mathbb{R}$ carries over to $\mathbb{R}^\infty$ by point-wise operations and thus $\mathbb{R}^\infty$
is a linear space. However, there is no natural way to turn $\mathbb{R}^\infty$ into a normed space.
In particular, each $\ell_p$ norm is only defined on a proper subset of $\mathbb{R}^\infty$ and cannot be
extended to a norm on $\mathbb{R}^\infty$. There is however a natural choice of a family of semi-norms,
defined as follows. For each $k \in \mathbb{N}$ let $p_k : \mathbb{R}^\infty \to \mathbb{R}$ be given by $p_k(x) = |x_k|$,
the absolute value of the $k$-th component of $x$. It is easily seen that $p_k$ is indeed a
semi-norm.

We thus see that in the absence of a norm a space may still admit a family of 
semi-norms and it is the case that, under suitable conditions, significant portions of 
the theory developed above (in particular the Hahn-Banach Theorem) have suitable 
analogues in locally convex spaces. We will now present the definition and explore 
some of the most basic aspects of these spaces. 

Definition 5.13 A locally convex linear space is a linear space $V$ together with a
family $\{p_i\}_{i \in I}$ of semi-norms on $V$. The family of semi-norms is said to separate
points if from $p_i(x) = 0$ for all $i \in I$ it follows that $x = 0$.

Example 5.17 Of course any normed space and any semi-normed space is a locally
convex space. It is not hard to show that any locally convex space given by a finite
family of semi-norms is in fact semi-normed. Thus the concept of locally convex
spaces becomes substantial only once one considers infinite families of semi-norms,
as in the case of $\mathbb{R}^\infty$ above.

We now consider the topology induced by the family of semi-norms in a locally 
convex space. 

Definition 5.14 Let $V$ be a locally convex space given by the family $\{p_i\}_{i \in I}$ of
semi-norms. For each $i \in I$ and $y \in V$ let $f_{i,y} : V \to \mathbb{R}$ be the function

$$f_{i,y}(x) = p_i(x - y).$$

The induced topology on $V$ is the smallest topology such that each of $f_{i,y} : V \to \mathbb{R}$
is continuous.

The existence of such a smallest topology is guaranteed by Proposition 3.11. In fact,
from the proof of that proposition, a local base at $y$ for the induced topology is given
by the collection $U_{B,\varepsilon}$, where $B \subseteq \{p_i\}_{i \in I}$ is a finite subset of the given family of
semi-norms and $\varepsilon > 0$, and

$$U_{B,\varepsilon} = \{x \in V \mid p_i(x - y) < \varepsilon \text{ for all } p_i \in B\}.$$

The following result now follows easily, and thus the proof is left for the reader. 
Proposition 5.6 Let $V$ be a locally convex space given by the family $\{p_i\}_{i \in I}$ of
semi-norms.

1. A sequence $\{x_m\}_{m \geq 1}$ in $V$ converges to $x_0 \in V$ in the induced topology if, and
only if, $p_i(x_m - x_0) \to 0$ for all $i \in I$.

2. The induced topology is Hausdorff if and only if the family of semi-norms 
separates points. 

3. The linear space operations are continuous. 
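The first item can be illustrated in $\mathbb{R}^\infty$ with the coordinate semi-norms $p_k(x) = |x_k|$. The Python sketch below is an illustration under assumptions: sequences are modelled as functions of the index, the unit sequences `e(m)` are a hypothetical choice of example, and only finitely many indices are inspected.

```python
def e(m):
    """The m-th unit sequence in R^infinity, represented as a function of the index k >= 1."""
    return lambda k: 1.0 if k == m else 0.0

def p(k, x):
    """The semi-norm p_k(x) = |x_k|."""
    return abs(x(k))

# For each fixed k, p_k(e_m) vanishes as soon as m > k, so e_m -> 0 in the induced topology.
tail_is_zero = all(p(k, e(m)) == 0.0 for k in range(1, 51) for m in range(k + 1, 201))

# Yet every e_m has a coordinate of absolute value 1, so the largest coordinate never shrinks.
sup_over_window = [max(p(k, e(m)) for k in range(1, 201)) for m in range(1, 201)]
```

Convergence with respect to the family is thus coordinate-wise convergence, which is strictly weaker than convergence in any sup-type norm on the bounded sequences.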

With these observations made, the resemblance between the theory of normed 
spaces we developed above and the theory of locally convex spaces with separated 
points is only starting to become visible. Without a doubt, it is more convenient to 
have a single norm and appeal to the rich theory of normed spaces whenever possible. 
However, often enough a norm is simply not available. The theory of locally convex 
spaces is rich enough so as to justify the somewhat cumbersome management of 
a family of semi-norms instead of a single norm. As mentioned above, significant
portions of the theory of (semi-)normed spaces transfer quite smoothly to the theory
of locally convex spaces; however, we do not explore this any further.


Exercises 


Exercise 5.41 Prove that if $\mathcal{B}_1$ and $\mathcal{B}_2$ are Banach spaces, then so is $\mathcal{B}_1 \times \mathcal{B}_2$ with
norm given by $\|(x, y)\| = \|x\| + \|y\|$.

Exercise 5.42 Prove that if $A : \mathcal{B}_1 \to \mathcal{B}_2$ is a closed linear operator, then its kernel
$\mathrm{Ker}(A)$ is a closed subspace of $\mathcal{B}_1$.

Exercise 5.43 Prove that any bounded linear operator is a closed operator.

Exercise 5.44 Prove Proposition 5.5.

Exercise 5.45 Let V be a normed space. We may now consider an induced topology 
on it in two ways. Namely, we may view it as a locally convex space where the family 
of semi-norms consists of just the given norm on V, and obtain the induced topology, 
or we may consider the induced metric $d(x, y) = \|x - y\|$ and consider its induced
topology. Show that the two topologies coincide. 

Exercise 5.46 Prove that every locally convex linear space V defined using a finite 
family $\{p_i\}_{i \in I}$ of semi-norms is semi-normed. In more detail, construct a single
semi-norm on V which induces the same topology as the finite family of semi-norms 
does. 

Exercise 5.47 Let $V$ be a locally convex space given by the countable family $\{p_i\}_{i \geq 1}$
of semi-norms. Consider the function $d : V \times V \to \mathbb{R}$ given by



Prove that $(V, d)$ is a semimetric space, and that it is a metric space if, and only if,
the family of semi-norms separates points.
Exercise 5.48 Continuing the previous exercise, show that the topology induced by
the semimetric and by the family of semi-norms coincide. (You will need to define 
the topology induced by a semimetric.) 

Exercise 5.49 Consider $\mathbb{R}$ with the standard structure of an inner product space and
let $V = C(\mathbb{R}, \mathbb{R})$ be the linear space of all continuous functions $x : \mathbb{R} \to \mathbb{R}$. For
each $t \in \mathbb{R}$ let

$$p_t(x) = |x(t)|.$$

For each $n \in \mathbb{N}$ let

$$q_n(x) = \max_{-n \leq t \leq n} |x(t)|.$$

For each compact subset $C \subseteq \mathbb{R}$ let

$$r_C(x) = \max_{t \in C} |x(t)|.$$

For each of the families $\{p_t\}_{t \in \mathbb{R}}$, $\{q_n\}_{n \in \mathbb{N}}$, and $\{r_C\}_{C \subseteq \mathbb{R}}$, where $C$ ranges over the
compact subsets of $\mathbb{R}$, decide whether it endows $V$ with the structure of a locally
convex space, whether the family of semi-norms separates points, and compare the
induced topologies.

Exercise 5.50 Prove Proposition 5.6. 

Further Reading 

The hitchhiker's guide to infinite dimensional analysis ([1]) offers a comprehensive
source on all the topics covered in this chapter and far beyond. For a more concise 
introduction to Banach spaces, complete with a detailed preliminaries chapter, see 
[2]. For a classic introduction to functional analysis see [5], and for a text somewhat 
different in style, including an entire chapter devoted to unbounded operators, see 
[4]. To read more about locally convex spaces the reader may consult [3]. 
Chapter 6

Topological Groups 


Abstract This short chapter is a brief introduction to the theory of topological 
groups, set in the context of Banach space theory. Not assuming any knowledge of 
groups, the chapter is entirely self-contained, presenting the definitions of groups, 
homomorphisms, (normal) subgroups, and quotient groups before moving on to the 
definition of topological groups. Some fundamental consequences of the interaction 
between algebra and topology are discussed. Introducing uniform spaces, the main 
message of the chapter is that the topology of a topological group is nearly metrizable, 
in the sense that it is uniformizable. 

Keywords Group • Group homomorphism • Topological group • Uniformity •
Quotient group • Subgroup • Normal subgroup • Abelian group

Banach spaces admit a particularly rich theory since the algebraic structure (i.e., the 
norm) induces a particularly nice metric (i.e., a complete one) resulting in a powerful 
interaction between algebra and geometry. Quite often, when an algebraic structure 
and a topological structure are allowed to interact, the fusion of the two theories 
results in a very intricate and interesting new theory. Such a fusion is the focus of 
this chapter where we present topological groups. 

There are at least two motivating reasons for considering topological groups. The 
first one is that just as groups model symmetry, so do topological groups model 
continuous symmetry. The second reason for studying topological groups is that 
quite often a portion of, say, a Banach space may fail to be a Banach space on its 
own but it may still retain some of the algebra and the topology of the ambient space 
to form a topological group. This is often the case since a group is a much weaker 
algebraic structure compared to a linear space, and a topology is much weaker than a 
metric (whether or not induced by a norm). Topological groups thus are much more 
commonplace since one only requires a group structure and a topology, rather than
a full linear space and a metric structure, yet the theory of topological groups still 
presents a powerful fusion between algebra and geometry. 

Section 6.1 introduces groups and group homomorphisms, without assuming any
prior knowledge of groups. The main objects of study, namely topological groups, 
are then presented in Sect. 6.2. Section 6.3 presents topological subgroups and a first 
encounter with the interesting interaction between algebra and topology, namely that 
the closure of a subgroup is a subgroup. The quotient construction for topological 
groups is the topic of Sect. 6.4, including a rather detailed account of normal subgroups.
Finally, Sect. 6.5 discusses another, deeper, consequence of the interaction
between algebra and topology, namely the uniformizability of the topology of a 
topological group, thus enabling one to import significant portions of the uniform 
machinery of metric spaces into the realm of topological groups. 


6.1 Groups and Homomorphisms 

The axioms defining a group are an abstraction of structure present in many familiar 
algebraic systems, namely an associative and unital binary operation with respect 
to which every element is invertible. The historical development of group theory 
is rooted in the study of roots of polynomials, famously with the work of Évariste
Galois and Niels Henrik Abel on finite groups. There is a distinct difference between 
the theory of finite groups and the theory of infinite groups. In finite group theory 
various counting arguments play an important role in determining the structure of 
groups. The recent classification of all finite simple groups, the result of an almost 
unfathomable collaboration spanning thousands of articles, is a milestone of finite 
group theory. Infinite groups though can be so wild that a full classification of all 
groups seems beyond reasonable expectations. Some limitations on size must be 
placed, and one way to impose such size constraints is through the introduction of 
a topology, and demanding that the group, as a topological space, be compact. The 
classification of compact Hausdorff topological groups is a much more manageable 
project. 

With that in mind, we turn now to present groups, but with an eye towards their 
common applications in Physics (i.e., as tools for the study of symmetry) and as a 
step towards defining topological groups (and thus finite groups are only glanced at). 


Definition 6.1 A group is a triple $(G, \cdot, e)$ where $G$ is a set, $\cdot$ is a binary operation
$G \times G \to G$, usually denoted by $(g, h) \mapsto g \cdot h$, or even just $(g, h) \mapsto gh$, and
$e \in G$ is a chosen element, such that the following axioms hold.

1. Associativity, i.e., $(g_1 g_2) g_3 = g_1 (g_2 g_3)$ for all $g_1, g_2, g_3 \in G$.

2. The element $e$ is an identity element, i.e., $eg = g = ge$ for all $g \in G$.

3. Existence of inverses, i.e., for all $g \in G$ there exists an element, denoted by
$g^{-1} \in G$ and called the inverse of $g$, satisfying $g g^{-1} = e = g^{-1} g$.

If gh = hg holds for all g, h e G, then the group is said to be commutative or 
abelian. In an abelian group it is customary to write g + h instead of gh, and to 
denote e, the neutral element, by 0. The cardinality of G is called the order of the 
group. 

When there is no danger of confusion it is common to refer to a group G, leaving 
the operation $\cdot$ and the identity $e$ implicit.
Proposition 6.1 In a group $G$

1. the identity is unique, i.e., if $e' \in G$ has the property that $e'g = g = ge'$ for all
$g \in G$, then $e' = e$;

2. inverses are unique, i.e., for every element $g \in G$, if $h_1, h_2 \in G$ satisfy

$$g h_2 = g h_1 = e = h_1 g = h_2 g,$$

then $h_1 = h_2$.

Proof The arguments are very similar to those given in the proof of Proposition 2.1, 
and thus we leave it to the reader to adapt that proof to the current situation. □ 

Remark 6.1 An inspection of the definition of linear space (Definition 2.1) reveals 
that we have already encountered abelian groups. Indeed, in that definition there are 
three sets of axioms, the first of which states, precisely, that (V, +, 0) is an abelian 
group for any linear space V . Thus, had we chosen to present groups prior to linear 
spaces, we could have defined a linear space as an abelian group together with a scalar 
product, satisfying the other two sets of axioms in the definition of linear space. 

Many examples of groups arise as the set of symmetries of an object. For instance, 
fixing an equilateral triangle in the plane, its symmetries correspond to the various 
ways in which rigid motions of the plane map the triangle onto itself. The set of 
all rigid motions that map the triangle to itself (of which there are precisely six: 
the identity, three reflections, and two rotations) is a group under the operation of 
composition. The verification is straightforward: the composition of rigid motions 
that fix the triangle is again a rigid motion that fixes the triangle, the identity function 
is a rigid motion that fixes the triangle (and thus serves as the identity element of the 
group), and the inverse function of a rigid motion that fixes the triangle is again a 
rigid motion that fixes the triangle. The resulting group is called the dihedral group 
of order 6, one member in the following family of groups. 
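The verification just described can be carried out mechanically. A symmetry of the triangle is determined by how it permutes the three vertices, and every one of the six permutations arises from a rigid motion. The Python sketch below encodes each symmetry as a tuple recording the images of the vertices 0, 1, 2 (an encoding assumed here for illustration; the names `compose` and `elements` are hypothetical) and checks the group axioms by exhaustion.

```python
from itertools import permutations

# Each symmetry is recorded by where it sends the vertices 0, 1, 2.
elements = list(permutations(range(3)))

def compose(s, t):
    """Composition 's after t': first apply t, then s."""
    return tuple(s[t[i]] for i in range(3))

identity = (0, 1, 2)

closed = all(compose(s, t) in elements for s in elements for t in elements)
associative = all(
    compose(compose(r, s), t) == compose(r, compose(s, t))
    for r in elements for s in elements for t in elements
)
has_identity = all(compose(identity, s) == s == compose(s, identity) for s in elements)
has_inverses = all(
    any(compose(s, t) == identity == compose(t, s) for t in elements)
    for s in elements
)
commutative = all(compose(s, t) == compose(t, s) for s in elements for t in elements)
```

All four axiom checks succeed, while `commutative` is false: the group of symmetries of the triangle is not abelian.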

Theorem 6.1 Let $P$ be a regular $n$-gon in the plane (i.e., a regular polygon with
$n$ vertices), $n > 2$. The set $D_n$ of all rigid motions (i.e., functions of the plane to
itself that preserve lengths and angles) that map $P$ onto itself, together with the
operation of composition of functions, is a group.

Proof The reader is invited to turn the argument for $D_3$ given above into a proof of
the general case. □

Remark 6.2 The groups $\{D_n\}_{n > 2}$ are known as the dihedral groups. It can be shown
that $D_n$ has order $2n$. For that reason, the dihedral group $D_n$ is also commonly (and
confusingly) denoted by $D_{2n}$.
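The order of $D_n$ can be checked by machine for small $n$. The Python sketch below relies on two standard facts that are not proved in the text and are assumed here: a symmetry of the $n$-gon is determined by its action on the vertices, and all symmetries are generated by one rotation and one reflection. The function name `dihedral` and the vertex labelling $0, \ldots, n-1$ are hypothetical.

```python
def dihedral(n):
    """Vertex permutations generated by a rotation and a reflection of a regular n-gon."""
    rotation = tuple((i + 1) % n for i in range(n))   # vertex i goes to i + 1 (mod n)
    reflection = tuple((-i) % n for i in range(n))    # vertex i goes to -i (mod n)
    identity = tuple(range(n))
    group = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in (rotation, reflection):
            h = tuple(s[g[i]] for i in range(n))  # apply g first, then the generator s
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

orders = {n: len(dihedral(n)) for n in range(3, 9)}
```

For each $n$ from 3 to 8 the computed order agrees with $2n$.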

Of course, one may vary the shape being fixed, as well as the degree to which the 
mappings preserve the ambient geometry, or change the ambient geometry itself. As 
an extreme example, given a set $S$, the set of all bijections $\sigma : S \to S$, together with
composition of functions, is easily seen to be a group. This is the group of symmetries
of the set $S$, clearly essentially determined only by the cardinality of $S$. This group
is denoted by $\mathrm{Sym}(S)$, or, when $S = \{1, 2, 3, \ldots, n\}$, by $S_n$.

Following the same general idea, the following is an important example in the 
theory of topological groups. 

Theorem 6.2 For $n \geq 1$ let $GL_n(\mathbb{R})$ be the set of all invertible linear operators
$A : \mathbb{R}^n \to \mathbb{R}^n$. With the operation of composition of functions, $GL_n(\mathbb{R})$ is a group.

Proof The straightforward verification of the axioms, using established results from 
Chap. 2, is left for the reader. □ 

The group $GL_n(\mathbb{R})$ is known as the general linear group. It is the group of linear
symmetries of the linear space $\mathbb{R}^n$. By specifying a basis for $\mathbb{R}^n$ one may identify
$GL_n(\mathbb{R})$ with the set of all invertible $n \times n$ matrices with real entries. The group
$GL_n(\mathbb{R})$ is essentially the same as the group of invertible $n \times n$ matrices over $\mathbb{R}$ with
the operation of matrix multiplication. Other important groups are defined in terms 
of certain linear symmetries, or, equivalently, as spaces of matrices satisfying various 
conditions. See the solved problems section for this chapter for detailed examples. 

As a matter of fact, the claim above is a special case of a much more general 
source of groups, as we now show. 

Theorem 6.3 Let $\mathcal{B}$ be a Banach space. The subset $G$ of $B(\mathcal{B})$ consisting of all
invertible bounded linear operators on $\mathcal{B}$, with the operation of composition of
functions, is a group.

Proof The implicit claim that the composition of invertible bounded linear operators
is again an invertible bounded linear operator follows from the fact that

$$\|A_1 A_2\| \leq \|A_1\| \|A_2\|$$

for all operators $A_1, A_2 \in B(\mathcal{B})$, as is easily established. Next, clearly, the identity
$\mathrm{id}_{\mathcal{B}}$ is a member of $G$ and it serves as the identity element of the group. The
fact that composition of functions is associative is trivially verified. The fact that
the inverse operator of a bounded operator is bounded is Corollary 5.2 of the Open
Mapping Theorem. □

We now turn to the structure preserving mappings between groups. 

Definition 6.2 A function $\psi : G \to H$ between groups is a group homomorphism
(or simply a homomorphism) if

$$\psi(g_1 g_2) = \psi(g_1) \psi(g_2)$$

for all $g_1, g_2 \in G$. If $\psi$ is also bijective, then it is called an isomorphism, and then
the groups $G$ and $H$ are said to be isomorphic, denoted by $G \cong H$.
Example 6.1 As said above, in the presence of a basis for $\mathbb{R}^n$, there is a bijection
between $GL_n(\mathbb{R})$ and $GL(n; \mathbb{R})$, given by mapping an invertible linear operator to
its representing matrix. In fact, this correspondence is a group isomorphism between
the two groups. The determinant function $\det : GL_n(\mathbb{R}) \to \mathbb{R}$, which assigns to a
linear operator $A : \mathbb{R}^n \to \mathbb{R}^n$ its determinant (i.e., the determinant of its representing
matrix in any basis), is a group homomorphism $\det : GL_n(\mathbb{R}) \to \mathbb{R}^*$. Here $\mathbb{R}^*$ is
the set of non-zero real numbers and the group operation is ordinary multiplication.
When $n = 1$ the determinant $\det : GL_1(\mathbb{R}) \to \mathbb{R}^*$ is a group isomorphism.
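The homomorphism property of the determinant, $\det(AB) = \det(A)\det(B)$, can be tested exhaustively on a small family of matrices. The Python sketch below restricts to $2 \times 2$ matrices with entries in $\{-1, 0, 1, 2\}$, an arbitrary sample chosen for illustration, and uses exact integer arithmetic. The helper names `det` and `matmul` are hypothetical.

```python
from itertools import product

def det(a):
    """Determinant of a 2x2 matrix given as ((a11, a12), (a21, a22))."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

# All invertible 2x2 matrices with entries in {-1, 0, 1, 2}.
invertible = [
    ((a, b), (c, d))
    for a, b, c, d in product((-1, 0, 1, 2), repeat=4)
    if det(((a, b), (c, d))) != 0
]

# det respects products, and products of invertible matrices stay invertible.
multiplicative = all(det(matmul(x, y)) == det(x) * det(y) for x in invertible for y in invertible)
products_invertible = all(det(matmul(x, y)) != 0 for x in invertible for y in invertible)
```

Such a finite check is of course no proof, but it makes the statement concrete: the determinant turns matrix multiplication into multiplication of non-zero real numbers.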

A group homomorphism is only required to preserve the group operations in the 
domain and codomain. It follows immediately though that the rest of the group 
structure is also preserved. For the proof it is helpful to note that the existence of 
inverses in a group immediately implies the left and right cancelation laws: in a group 
$G$, if $gh = gh'$, then $h = h'$, and, similarly, if $hg = h'g$, then $h = h'$.

Proposition 6.2 Let $\psi : G \to H$ be a group homomorphism. Then $\psi(e) = e$ and
$\psi(g^{-1}) = \psi(g)^{-1}$ for all $g \in G$.

Proof We only establish that $\psi(e) = e$ (notice that the $e$ on the left is the identity in
$G$ while on the right it is the identity in $H$, and thus they may be different). Notice
that $\psi(e)\psi(e) = \psi(ee) = \psi(e) = e\psi(e)$. The cancelation law now implies that
$\psi(e) = e$. The verification that $\psi(g^{-1}) = \psi(g)^{-1}$ is now trivial. □

The verification of the following properties of homomorphisms and isomorphisms 
is very similar to the analogous results established for linear operators, for instance 
in Proposition 2.8. We thus omit the details of the proof. 

Proposition 6.3 Let $G_1 \xrightarrow{\varphi} G_2 \xrightarrow{\psi} G_3$ be group homomorphisms. Then

1. The composition $\psi \circ \varphi$ is a homomorphism.

2. If both $\varphi$ and $\psi$ are isomorphisms, then so is $\psi \circ \varphi$.

3. If $\psi$ is an isomorphism, then so is $\psi^{-1}$.

4. The identity function $\mathrm{id}_G : G \to G$ is a group isomorphism.


6.2 Topological Groups and Homomorphisms

A topological group is a group together with a topology on it which is required 
to be compatible with the group structure, in the sense that the group operations 
are to be continuous. Some of the far-reaching consequences resulting from this 
fusion between algebra and topology will be explored in the subsequent sections. In 
particular, it will be shown that the topology must be sufficiently metric-like so as to 
render such concepts as uniform continuity and Cauchy sequences meaningful. For 
now we focus on the definition and examples. 
Definition 6.3 A topological group is a group $G$ together with a topology on the set
$G$ such that the group operation $\cdot : G \times G \to G$ is continuous (with respect to the
given topology on the codomain and the product topology on the domain), and such
that the function $G \to G$ given by $g \mapsto g^{-1}$ is continuous.

Example 6.2 The following examples of topological groups are all subsets of $\mathbb{R}$ or $\mathbb{C}$,
and we consider each of them with the subspace topology induced, in each case, by
the metric $d(x, y) = |x - y|$.

1. The circle $S^1 = \{z \in \mathbb{C} \mid |z| = 1\}$ with the operation of multiplication.

2. The set $\mathbb{R}$ of all real numbers with the operation of addition.

3. The set $\mathbb{R}^*$ of all non-zero real numbers with the operation of multiplication.

4. The set $\mathbb{C}$ of all complex numbers with the operation of addition.

5. The set $\mathbb{C}^*$ of all non-zero complex numbers with the operation of multiplication.
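For the circle $S^1$ the group structure and the continuity of inversion can be probed numerically. The Python sketch below samples twelve points of the circle (an arbitrary, illustrative choice) and checks, up to floating-point tolerance, that products stay on the circle, that the inverse of $z$ is its conjugate, and that inversion preserves distances on $S^1$, which in particular makes it continuous. The helper name `on_circle` is hypothetical.

```python
import cmath
import math

def on_circle(z, tol=1e-12):
    """Test membership in S^1 = {z : |z| = 1}, up to floating-point tolerance."""
    return abs(abs(z) - 1.0) < tol

points = [cmath.exp(1j * 2 * math.pi * k / 12) for k in range(12)]

# Closure under multiplication.
closed_under_product = all(on_circle(z * w) for z in points for w in points)

# On the circle the inverse of z is its complex conjugate.
inverse_is_conjugate = all(abs(z * z.conjugate() - 1) < 1e-12 for z in points)

# |1/z - 1/w| = |z - w| for z, w on the circle, so inversion is an isometry there.
inversion_is_isometry = all(
    abs(abs(1 / z - 1 / w) - abs(z - w)) < 1e-12 for z in points for w in points
)
```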

A large family of examples of topological groups is obtained by the following result.


Lemma 6.1 Let V be a normed space. Then ( V , +, 0), i.e., the additive structure of 
the space, is a topological group. 

Proof We already noted that the additive part of a linear space is an abelian group. 
The continuity requirements are satisfied by Proposition 5.1. □ 

Another significant source of topological groups arises from groups of bounded linear 
operators on a Banach space. 

Theorem 6.4 Let $\mathcal{B}$ be a Banach space. The subset $G$ of $B(\mathcal{B})$ consisting of the
invertible bounded linear operators on $\mathcal{B}$ is a topological group when endowed with
the operator norm topology. 

Proof Having already observed that G is a group under composition of functions, 
it remains to verify the continuity of the composition and the inverse mapping with 
respect to the operator norm. The continuity of the composition follows at once from 
the inequality 

$$\|A_1 A_2\| \leq \|A_1\| \|A_2\|$$

while the continuity of the inverse mapping is the statement of Exercise 5.30. □ 

Corollary 6.1 The general linear group $GL_n(\mathbb{R})$, when endowed with the operator
norm topology, is a topological group (which is not abelian if $n > 1$).

Proof $GL_n(\mathbb{R})$ is the group of invertible operators on the Banach space $\mathbb{R}^n$. □

Since a topological group is a fusion between a group and a topological space, so 
are homomorphisms of topological groups a fusion between group homomorphisms 
and continuous mappings. 
Definition 6.4 A function $\psi : G \to H$ between topological groups is said to be a
homomorphism if $\psi$ is both continuous and a group homomorphism. We say that $\psi$
is an isomorphism if $\psi$ is both a group isomorphism and a homeomorphism. If an
isomorphism between topological groups $G$ and $H$ exists, then the groups are said
to be isomorphic, denoted by $G \cong H$.

Example 6.3 In a topological group $G$, the mapping $g \mapsto g^{-1}$ is a homeomorphism
but not usually a group homomorphism. Indeed, it is a group homomorphism if $G$ is
abelian, since then

$$(gh)^{-1} = g^{-1} h^{-1}$$

while in general one only has the equality

$$(gh)^{-1} = h^{-1} g^{-1}.$$

For any fixed element $a \in G$, the left translation mapping $g \mapsto ag$ and the right
translation mapping $g \mapsto ga$ are both homeomorphisms, but are generally not group
homomorphisms. The determinant function $\det : GL_n(\mathbb{R}) \to \mathbb{R}^*$ is continuous
(since the determinant is polynomial in the entries of the matrix) and, as we have
seen, is a group homomorphism, and thus is a homomorphism of topological groups.
Moreover, $\det : GL_1(\mathbb{R}) \to \mathbb{R}^*$ is an isomorphism.
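The failure of inversion to be a homomorphism in a non-abelian group can be seen already in the symmetric group $S_3$. The Python sketch below uses $S_3$ (which becomes a topological group under the discrete topology, an assumption made here only to connect with the example) with permutations encoded as tuples. The names `compose` and `inverse` are hypothetical.

```python
from itertools import permutations

elements = list(permutations(range(3)))

def compose(s, t):
    """Composition 's after t': first apply t, then s."""
    return tuple(s[t[i]] for i in range(3))

def inverse(s):
    """Inverse permutation: if s sends i to s[i], the inverse sends s[i] back to i."""
    inv = [0] * 3
    for i, si in enumerate(s):
        inv[si] = i
    return tuple(inv)

# The identity (gh)^{-1} = h^{-1} g^{-1} holds for every pair.
reverses_order = all(
    inverse(compose(g, h)) == compose(inverse(h), inverse(g))
    for g in elements for h in elements
)

# Pairs for which (gh)^{-1} differs from g^{-1} h^{-1}.
counterexamples = [
    (g, h) for g in elements for h in elements
    if inverse(compose(g, h)) != compose(inverse(g), inverse(h))
]
```

The list `counterexamples` is non-empty, so inversion is not a group homomorphism of $S_3$, although it always reverses the order of a product.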

The reader is invited to state and prove the analogous result of Proposition 6.3 for 
topological homomorphisms and isomorphisms. 


6.3 Topological Subgroups 

A topological subgroup of a topological group is a subset which, in its own right, is
a topological group with the induced algebraic and topological structures from the 
ambient topological group. In other words, a topological subgroup is a fusion of the 
concepts subgroup and topological subspace. Fortunately, it turns out that the sub- 
space topology on a subgroup is automatically compatible with the group structure, 
and thus it is only the algebraic structure that dictates what the topological sub- 
groups are. Consequently, there is no difference between the topological subgroups 
of a topological group and its subgroups when the topology is forgotten. 

In light of the above, we focus on the notion of subgroup, and, anticipating the 
contents of the next section, already discuss normal subgroups. We then observe 
the first non-trivial consequence of the interaction of the algebra and the topology, 
namely that the closure of a subgroup is a subgroup. 

Definition 6.5 A subset $H$ of a group $G$ is a subgroup of $G$ if the group operation in
$G$ restricts to a group operation on $H$. Equivalently, $H$ is a subgroup of $G$ if $e \in H$
and for all $h_1, h_2 \in H$

$$h_1 h_2^{-1} \in H.$$

The motivation for the following definition is presented in the next section. 

Definition 6.6 A subgroup $H$ of a group $G$ is said to be normal if $ghg^{-1} \in H$ for
all $h \in H$ and $g \in G$. The element $ghg^{-1}$ is called the conjugate of $h$ by $g$. Thus, a
subgroup is normal when it is closed under conjugation.

Remark 6.3 Note that the condition for normality of $H$ in $G$ is equivalent to: for all
$g \in G$ and $h \in H$, there exists $h' \in H$ such that

$$hg = gh'.$$

Thus, the condition of normality can be seen as a weak form of commutativity:
elements in $G$ commute with elements in $H$, up to a replacement by another element
in $H$. In particular, if $G$ is abelian, then every subgroup $H$ of $G$ is normal.

Example 6.4 Given a group homomorphism $\psi : G \to H$, its kernel is the set
$\mathrm{Ker}(\psi) = \{g \in G \mid \psi(g) = e\}$. The kernel of a homomorphism is easily seen
to always be a normal subgroup of $G$. One may also easily verify (compare with
Theorem 2.8) that $\mathrm{Ker}(\psi) = \{e\}$, i.e., the kernel is the trivial subgroup of $G$, if, and
only if, $\psi$ is injective.

Regarding subgroups of a topological group G, the presence of a topology entails 
no complications, at least in the following sense. 

Lemma 6.2 Let G be a topological group and H a subgroup of G. When endowed 
with the subspace topology, H is a topological group. 

The proof is immediate, relying on the simple fact that the restriction of a continuous 
mapping to a subspace remains continuous. 

We may now demonstrate one consequence of the fruitful interaction between the 
group structure and the topology. 

Theorem 6.5 If $H$ is a (normal) subgroup of a topological group $G$, then $\overline{H}$, the
closure of $H$, is also a (normal) subgroup of $G$.

Proof To show that $\overline{H}$ is a subgroup, we need to show that $gh^{-1} \in \overline{H}$ for all $g, h \in \overline{H}$.
In other words, for the function $f : G \times G \to G$ given by $f(g, h) = gh^{-1}$, we need
to establish that $f(\overline{H} \times \overline{H}) \subseteq \overline{H}$. Since $f$ is clearly continuous, and $\overline{H}$ is closed,
it follows that $f^{-1}(\overline{H})$ is closed. Next, $H \times H \subseteq f^{-1}(H) \subseteq f^{-1}(\overline{H})$, where the
first inclusion follows from the fact that $H$ is a subgroup and the second inclusion
since $H \subseteq \overline{H}$. Taking closures leads to $\overline{H \times H} \subseteq \overline{f^{-1}(\overline{H})} = f^{-1}(\overline{H})$. The result
follows by noting that $\overline{H \times H} = \overline{H} \times \overline{H}$ (this latter fact is a general property of
closures that has nothing to do with the group structure on $H$). We leave it to the
reader to verify that if $H$ is a normal subgroup, then so is $\overline{H}$. □


6.4 Quotient Groups

Whereas the notion of subgroup of a topological group was quite straightforward, 
with a seamless interaction between the algebraic and the topological demands, the 
situation for the quotient of a topological group by a subgroup is more intricate. The 
main difficulty lies with the algebraic structure, namely that for an arbitrary subgroup
$H$, it is not always possible to obtain a group structure on the relevant quotient set.
Topologically however, there are no difficulties since the quotient topology is always 
available. The details are given below. 

Given a group $G$ and a subgroup $H$, defining $g_1 \sim g_2$ precisely when $g_1^{-1}g_2 \in H$
is easily seen to be an equivalence relation on $G$. The equivalence class of $g \in G$ is
easily seen to be the set

$$gH = \{gh \mid h \in H\}$$

and is called the left coset of $g$. In particular, two left cosets $g_1H$ and $g_2H$ are equal
if, and only if, $g_1^{-1}g_2 \in H$. Similarly, defining $g_1 \sim g_2$ precisely when $g_1g_2^{-1} \in H$
gives rise to an equivalence relation whose equivalence classes are all of the form

$$Hg = \{hg \mid h \in H\}$$

and are called right cosets (of $H$ in $G$). Of course, if $G$ is abelian, then the left and
right cosets coincide, but in general they may be different. However, any general
result concerning left cosets in a group has a corresponding dual result about right
cosets. In fact the translation between results on left and right cosets is achieved by
means of the correspondence $gH \leftrightarrow Hg^{-1}$. For this reason, we may safely
concentrate on either left or right cosets only, and we choose the left ones.

Thus, any choice of a subgroup $H$ of $G$ gives rise to an equivalence relation $\sim$
with corresponding quotient set

$$G/H = \{gH \mid g \in G\},$$

the set of all left cosets of $H$ in $G$, with the corresponding canonical projection
$\pi : G \to G/H$ given by $\pi(g) = gH$. We now address the question whether the
group structure on $G$ induces a group structure on $G/H$.
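
To see left and right cosets differ in a small non-abelian group, the following sketch enumerates them in the symmetric group on three letters. Permutations are encoded as tuples; the names `compose`, `left_cosets`, `right_cosets` and the choice of the subgroups `H` and `N` are assumptions made for this illustration only.

```python
from itertools import permutations

def compose(p, q):
    """Composite permutation 'p after q': apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def left_cosets(G, H):
    """The set of left cosets gH, each stored as a frozenset."""
    return {frozenset(compose(g, h) for h in H) for g in G}

def right_cosets(G, H):
    """The set of right cosets Hg, each stored as a frozenset."""
    return {frozenset(compose(h, g) for h in H) for g in G}

G = list(permutations(range(3)))       # the symmetric group on {0, 1, 2}
e = (0, 1, 2)
H = [e, (1, 0, 2)]                     # subgroup generated by one transposition
N = [e, (1, 2, 0), (2, 0, 1)]          # subgroup of the three cyclic rotations

for name, K in (("H", H), ("N", N)):
    L, R = left_cosets(G, K), right_cosets(G, K)
    # The distinct left cosets cover G, and their sizes add up to |G|,
    # so they are pairwise disjoint: they partition G.
    assert sum(len(c) for c in L) == len(G)
    print(name, "number of left cosets:", len(L), "| left == right:", L == R)
```

For the two-element subgroup the families of left and right cosets are different partitions of the group, while for the rotation subgroup they coincide.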

Theorem 6.6 Let $H$ be a subgroup of a group $G$. If the operation on $G/H$ given by
$(g_1H) \cdot (g_2H) = (g_1g_2)H$ is well-defined, then it defines a group structure on $G/H$
and $\pi : G \to G/H$ is a group homomorphism.

Proof The verification of the group axioms is trivial. For instance, noticing that the
coset $eH$ is precisely the set $H$, we show that $H$ is an identity element for the given
operation. Let $gH \in G/H$ be an arbitrary element. Then

$$H \cdot gH = eH \cdot gH = (eg)H = gH$$

and similarly $gH \cdot H = gH$. We leave the verification of the other group axioms to
the reader. The claim that $\pi$ is a group homomorphism is nothing but the observation
that $\pi(g_1g_2) = (g_1g_2)H = g_1H \cdot g_2H = \pi(g_1)\pi(g_2)$. □

We thus see that as soon as the operation $(g_1H) \cdot (g_2H) = (g_1g_2)H$ is well-defined,
the quotient set $G/H$ acquires a group structure. For the operation to be well-defined,
one needs to verify that if $g_1H = g_3H$ and $g_2H = g_4H$, then necessarily $g_1g_2H =
g_3g_4H$. In other words, given that $g_1^{-1}g_3 \in H$ and that $g_2^{-1}g_4 \in H$, it must follow
that $(g_1g_2)^{-1}g_3g_4 \in H$. Let us denote $g_1^{-1}g_3 = h_1$ and $g_2^{-1}g_4 = h_2$. Noting that

$$(g_1g_2)^{-1}g_3g_4 = g_2^{-1}g_1^{-1}g_3g_4 = g_2^{-1}h_1g_4,$$

if $G$ were abelian, then we would obtain that

$$(g_1g_2)^{-1}g_3g_4 = g_2^{-1}g_4h_1 = h_2h_1 \in H,$$

as desired. However, $G$ need not be abelian. A closer look though reveals that a
commutativity demand is far too strong. Indeed, if $H$ is a normal subgroup of $G$,
then for $g_4$ and $h_1$ as above, there exists $h' \in H$ such that $h_1g_4 = g_4h'$, and then

$$(g_1g_2)^{-1}g_3g_4 = g_2^{-1}h_1g_4 = g_2^{-1}g_4h' = h_2h' \in H.$$

We summarize this discussion as follows, remarking first that for a normal subgroup
$H$ in $G$, it follows at once that the left and right cosets of $H$ in $G$ coincide, and thus
one may simply speak of the cosets of $H$ in $G$.

Theorem 6.7 If $H$ is a normal subgroup of $G$, then the quotient set $G/H$ of all
cosets of $H$ in $G$ is a group when endowed with the operation

$$(g_1H) \cdot (g_2H) = (g_1g_2)H.$$

The canonical projection $\pi : G \to G/H$ is then a group homomorphism. The group
$G/H$ is called the quotient group of $G$ by $H$.
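
The role of normality in making the coset product well-defined can be tested by brute force on a small example. The sketch below works in the symmetric group on three letters, with permutations encoded as tuples; all function names and the two sample subgroups are hypothetical choices made for this illustration.

```python
from itertools import permutations, product

def compose(p, q):
    """Composite permutation 'p after q': apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Inverse permutation of p."""
    result = [0] * len(p)
    for i, image in enumerate(p):
        result[image] = i
    return tuple(result)

def coset(g, H):
    """The left coset gH as a frozenset."""
    return frozenset(compose(g, h) for h in H)

def is_normal(G, H):
    """Check that g h g^{-1} lies in H for all g in G and h in H."""
    return all(compose(compose(g, h), inverse(g)) in H for g in G for h in H)

def product_is_well_defined(G, H):
    """Check that g1H = g3H and g2H = g4H always force (g1g2)H = (g3g4)H."""
    for g1, g2, g3, g4 in product(G, repeat=4):
        if coset(g1, H) == coset(g3, H) and coset(g2, H) == coset(g4, H):
            if coset(compose(g1, g2), H) != coset(compose(g3, g4), H):
                return False
    return True

G = list(permutations(range(3)))
e = (0, 1, 2)
H = [e, (1, 0, 2)]                     # a subgroup that is not normal
N = [e, (1, 2, 0), (2, 0, 1)]          # the rotation subgroup, which is normal

for name, K in (("H", H), ("N", N)):
    print(name, "normal:", is_normal(G, K),
          "| coset product well-defined:", product_is_well_defined(G, K))
```

In this example the two properties agree for both subgroups, in line with the discussion preceding Theorem 6.7.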

Corollary 6.2 A subgroup $H$ of a group $G$ is normal if, and only if, $H$ is the kernel
of some group homomorphism whose domain is $G$.

Proof It was already noted that the verification that the kernel of a homomorphism
is a normal subgroup is straightforward. For the converse, show that

$$\mathrm{Ker}(\pi) = H$$

for the canonical projection $\pi : G \to G/H$. □
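
The computation requested in the proof can be sketched as follows. It uses only the fact that $eH = H$ is the identity element of $G/H$ and the criterion that two left cosets $g_1H$ and $g_2H$ coincide exactly when $g_1^{-1}g_2 \in H$; it is offered as one possible way to fill in the step, not as the only one.

```latex
\begin{align*}
g \in \operatorname{Ker}(\pi)
  &\iff \pi(g) = eH
     && \text{($eH = H$ is the identity element of $G/H$)} \\
  &\iff gH = eH
     && \text{(definition of $\pi$)} \\
  &\iff e^{-1}g = g \in H
     && \text{(criterion for equality of left cosets).}
\end{align*}
```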

The situation now for a topological group G and a subgroup H of it is summarized 
in the following result whose proof is left for the reader.

Theorem 6.8 Let $G$ be a topological group and $H$ a subgroup.

1. The set $\{gH \mid g \in G\}$ of all left cosets of $H$ in $G$ is the quotient set for the
equivalence relation $g_1 \sim g_2 \iff g_1^{-1}g_2 \in H$. When endowed with the
quotient topology it is a topological space but it need not be a group.

2. The set $\{Hg \mid g \in G\}$ of all right cosets of $H$ in $G$ is the quotient set for the
equivalence relation $g_1 \sim g_2 \iff g_1g_2^{-1} \in H$. When endowed with the
quotient topology it is a topological space but it need not be a group.

3. If $H$ is normal in $G$, then left and right cosets coincide, as do the two quotient
sets above, which are then denoted by $G/H$. The quotient set $G/H$ acquires a
group structure from $G$ and with the quotient topology it is a topological group.


6.5 Uniformities 

In the presence of a topology notions such as convergence and continuity become 
available. However, a topology is too weak to speak of Cauchy sequences (and thus 
of completeness) or of uniform continuity, notions that do exist in the presence of a 
metric. However, there exists a structure, known as a uniformity, which is in between 
a topology and a metric, and which does allow one to speak of Cauchy sequences, 
completeness, completions, and uniform continuity. Moreover, every topological 
group is automatically endowed with two (often distinct) such uniformities, as we 
show below. 

We first define the concept of uniformity in general and then proceed to describe 
the canonical uniformities present in any topological group. As motivation, recall 
that the axiomatization of a topology can be seen as the result of purifying a distance 
function, eliminating any trace of details irrelevant for continuity. Similarly, the 
axiomatization of a uniformity can be seen to arise from a similar purification of a 
distance function from anything irrelevant to uniform continuity. 

Definition 6.7 Let $X$ be a set. A non-empty collection $\mathcal{E} = \{E_i\}_{i \in I}$ of relations
$E_i \subseteq X \times X$ is said to be a uniform structure or a uniformity on $X$ if the following
conditions hold (where we write $x \sim_E y$ to indicate that $(x, y) \in E$).

1. $x \sim_E x$ for all $E \in \mathcal{E}$ and $x \in X$.

2. For all $E \in \mathcal{E}$ there exists an $F \in \mathcal{E}$ such that $x \sim_F y$ implies $y \sim_E x$.

3. For all $E \in \mathcal{E}$ there exists an $F \in \mathcal{E}$ such that $x \sim_F y$ and $y \sim_F z$ together
imply that $x \sim_E z$.

4. If $E, F \in \mathcal{E}$, then so is $E \cap F$.

5. If $E \subseteq F \subseteq X \times X$ and $E \in \mathcal{E}$, then $F \in \mathcal{E}$.

The pair $(X, \mathcal{E})$ is then called a uniform space. The members of $\mathcal{E}$ are known as
entourages. If context clarifies any ambiguity, it is common to refer to a uniform
space $X$, leaving the collection $\mathcal{E}$ of entourages implicit.

Example 6.5 Given a metric space $(X, d)$, a uniformity on the set $X$ is induced by
the distance function as follows. For every $\varepsilon > 0$ consider the set

$$E_\varepsilon = \{(x, y) \in X \times X \mid d(x, y) < \varepsilon\}.$$

It then follows that the collection

$$\mathcal{E} = \{E \subseteq X \times X \mid \exists \varepsilon > 0,\ E \supseteq E_\varepsilon\}$$

is a uniformity. The verification of the axioms is straightforward. For instance,
condition 3 follows from the triangle inequality: given $E \supseteq E_\varepsilon$, let $F = E_{\varepsilon/2}$; then
if $x \sim_F y$ and $y \sim_F z$, namely $d(x, y), d(y, z) < \varepsilon/2$, then $d(x, z) < \varepsilon$, so that
$x \sim_E z$. The resulting uniformity is called the induced uniformity.

Definition 6.8 A function $f : X \to Y$ between uniform spaces is said to be uniformly
continuous if $f^{-1}(E)$ is an entourage in $X$ for every entourage $E$ in $Y$ (here
$f^{-1}(E)$ stands for $\{(x, x') \in X \times X \mid f(x) \sim_E f(x')\}$).

Recall that a topology captures the notion of continuity in the sense that a function 
$f : X \to Y$ between metric spaces is continuous in the metric sense if, and only if, $f$
is continuous with respect to the induced topologies. We now establish the analogous 
result for uniformities and uniformly continuous functions. 

Theorem 6.9 A function $f : X \to Y$ between metric spaces is uniformly continuous
in the metric sense if, and only if, $f$ is uniformly continuous with respect to the induced
uniformities.

Proof Suppose that $f$ is uniformly continuous in the sense that for every entourage
$F$ in $Y$, $f^{-1}(F)$ is an entourage in $X$. Given $\varepsilon > 0$ consider the entourage

$$F_\varepsilon = \{(y, y') \in Y \times Y \mid d(y, y') < \varepsilon\}.$$

By definition of the induced uniformity, $f^{-1}(F_\varepsilon)$ contains an entourage of the form
$E_\delta = \{(x, x') \in X \times X \mid d(x, x') < \delta\}$, for some $\delta > 0$. In particular, if $d(x, x') < \delta$,
then $(x, x') \in E_\delta$ and thus $(f(x), f(x')) \in F_\varepsilon$, namely $d(f(x), f(x')) < \varepsilon$. In other
words, $f$ is uniformly continuous with respect to the metrics.

In the other direction, suppose that $f$ is uniformly continuous in the sense that
for all $\varepsilon > 0$ there exists a corresponding $\delta > 0$ such that $d(f(x), f(x')) < \varepsilon$
provided that $d(x, x') < \delta$. To show that $f$ is uniformly continuous with respect to
the induced uniformities, let $F$ be an entourage in $Y$, namely there exists an $\varepsilon > 0$
such that $F \supseteq F_\varepsilon$, where

$$F_\varepsilon = \{(y, y') \in Y \times Y \mid d(y, y') < \varepsilon\}.$$

With $\delta > 0$ corresponding to $\varepsilon$, we claim that $f^{-1}(F) \supseteq E_\delta$, where

$$E_\delta = \{(x, x') \in X \times X \mid d(x, x') < \delta\}.$$





Indeed, if $(x, x') \in E_\delta$, then $d(x, x') < \delta$ and thus $d(f(x), f(x')) < \varepsilon$. It follows
that $(x, x') \in f^{-1}(F_\varepsilon)$, and thus $E_\delta \subseteq f^{-1}(F_\varepsilon) \subseteq f^{-1}(F)$. This shows that $f^{-1}(F)$
contains an entourage in $X$, and thus is itself an entourage, which is what we needed
to show. □

Clearly, the distance function cannot be reconstructed from the induced uniformity.
It is a simple matter to find different metrics that induce the same uniformity (for
instance, by scaling all distances by a positive constant). A uniform space $(X, \mathcal{E})$
whose uniform structure $\mathcal{E}$ is the induced uniformity for some distance function
$d$ on $X$ is said to be a metrizable uniform space. Thus, so far, we established that
the concept of uniform space is weaker than that of metric space, and that it correctly
captures the notion of uniformly continuous functions. Next, we show that every
uniform space induces a topology.
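
The scaling remark can be made explicit with a short computation. In the sketch below, $c > 0$ is a fixed constant, $d' = c \cdot d$ is the rescaled metric, and $E_\varepsilon$, $E'_\varepsilon$ denote the basic entourages of $d$ and $d'$ respectively; the primed notation is introduced here only for illustration.

```latex
\begin{align*}
E'_\varepsilon
  &= \{(x, y) \in X \times X \mid c \cdot d(x, y) < \varepsilon\} \\
  &= \{(x, y) \in X \times X \mid d(x, y) < \varepsilon / c\}
   = E_{\varepsilon / c}.
\end{align*}
% As \varepsilon ranges over all positive reals so does \varepsilon / c, hence
% \{E'_\varepsilon\}_{\varepsilon > 0} = \{E_\varepsilon\}_{\varepsilon > 0},
% and the two families have exactly the same supersets.
```

Since the induced uniformity consists of all supersets of basic entourages, $d$ and $d'$ induce the same uniformity even though they are different metrics whenever $c \neq 1$ and $X$ has more than one point.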

Theorem 6.10 Let $(X, \mathcal{E})$ be a uniform space. The collection $\tau$ of sets $U \subseteq X$ such
that for all $x \in U$ there exists an entourage $E \in \mathcal{E}$ such that $\{y \in X \mid x \sim_E y\} \subseteq U$
is a topology on $X$, called the induced topology.

Proof The proof is left as an exercise for the reader. □ 

The passage from a distance function to the induced uniformity loses information 
and is thus not a reversible process. Similarly the passage from a uniformity to the 
induced topology is not reversible. A topological space $(X, \tau)$ whose topology $\tau$ is
the induced topology for some uniformity on X is called a uniformizable topological 
space. 

We are now ready to describe the two uniformities present on any topological
group. Recall that every element $a \in G$ gives rise to the left translation map $g \mapsto ag$
and to the right translation map $g \mapsto ga$, both of which are homeomorphisms. We
extend the translation mappings so that they also operate on subsets $S \subseteq G$, that is
we define $aS = \{as \mid s \in S\}$, and similarly $Sa = \{sa \mid s \in S\}$.

Theorem 6.11 Let $G$ be a topological group and $\mathcal{B}$ a local basis at $e$. For every
open set $U \in \mathcal{B}$ let $R_U = \{(x, y) \in G \times G \mid xy^{-1} \in U\}$. The collection

$$\mathcal{E}_R = \{E \subseteq G \times G \mid \exists U \in \mathcal{B},\ E \supseteq R_U\}$$

is a uniformity on $G$ called the induced right uniformity. Similarly, for each $U \in \mathcal{B}$
let $L_U = \{(x, y) \in G \times G \mid x^{-1}y \in U\}$. The collection

$$\mathcal{E}_L = \{E \subseteq G \times G \mid \exists U \in \mathcal{B},\ E \supseteq L_U\}$$

is a uniformity on $G$ called the induced left uniformity.

Proof This is a routine verification of the axioms of a uniform space. □
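
For orientation, here is a sketch of how the verification might go for the right uniformity, writing $R_U = \{(x, y) \mid xy^{-1} \in U\}$ for $U$ in the local basis $\mathcal{B}$ at $e$. It assumes that $\mathcal{B}$ consists of open sets containing $e$ and that every open neighbourhood of $e$ contains a member of $\mathcal{B}$; the left uniformity is handled symmetrically.

```latex
\begin{itemize}
  \item \textbf{Reflexivity.} For every $U \in \mathcal{B}$ and $x \in G$,
        $xx^{-1} = e \in U$, so $(x, x) \in R_U$.
  \item \textbf{Symmetry.} Given $U \in \mathcal{B}$, the set
        $U^{-1} = \{u^{-1} \mid u \in U\}$ is an open neighbourhood of $e$, since
        inversion is a homeomorphism. Choose $V \in \mathcal{B}$ with
        $V \subseteq U^{-1}$. If $xy^{-1} \in V$, then
        $yx^{-1} = (xy^{-1})^{-1} \in V^{-1} \subseteq U$.
  \item \textbf{Composition.} Given $U \in \mathcal{B}$, continuity of the
        multiplication at $(e, e)$ yields open neighbourhoods $W_1, W_2$ of $e$
        with $W_1 W_2 \subseteq U$. Choose $V \in \mathcal{B}$ with
        $V \subseteq W_1 \cap W_2$. If $xy^{-1} \in V$ and $yz^{-1} \in V$, then
        $xz^{-1} = (xy^{-1})(yz^{-1}) \in VV \subseteq U$.
  \item \textbf{Intersections.} If $E \supseteq R_U$ and $F \supseteq R_{U'}$,
        choose $V \in \mathcal{B}$ with $V \subseteq U \cap U'$; then
        $E \cap F \supseteq R_U \cap R_{U'} \supseteq R_V$.
  \item \textbf{Supersets.} Immediate from the definition of the collection.
\end{itemize}
```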

The following properties of the induced uniformities justify the names given to them. 

Theorem 6.12 Let $G$ be a topological group and let $\mathcal{E}_L$ and $\mathcal{E}_R$ be the left and right
induced uniformities. Then

1. The induced topology by either $\mathcal{E}_L$ or $\mathcal{E}_R$ is the original topology on $G$.

2. Every left translation $g \mapsto ag$ is uniformly continuous with respect to $\mathcal{E}_L$.

3. Every right translation $g \mapsto ga$ is uniformly continuous with respect to $\mathcal{E}_R$.

Proof The proof is straightforward. □

Remark 6.4 We conclude the book with the following remark. A sequence $\{x_m\}_{m \ge 1}$
in a uniform space $X$ is said to be a Cauchy sequence if for every entourage $E$ there
exists an $N \in \mathbb{N}$ such that $x_k \sim_E x_n$ for all $k, n > N$. A uniform space $X$ is said to
be complete if every Cauchy sequence in X converges (under the topology induced 
by the uniformity). The process of completion of a metric space (as given in Chap. 4 
by means of Cauchy sequences) extends to a completion process for every uniform 
space (however not by means of Cauchy sequences, as they are too weak for general 
uniform spaces). Similarly, other uniform concepts of metric spaces have analogous 
results valid for all uniform spaces. 

The result above shows that the topology of a topological group is always uni- 
formizable (and the uniformity can be chosen to further be compatible with all right 
or all left translations). Thus, in a sense, the topology of topological groups is rather 
tame, behaving more like metric spaces do than the more wild topological spaces 
out there do. 

Exercises 

Exercise 6.1 Let $G$ be a group and $\mathrm{Aut}(G)$ the set of all isomorphisms $f : G \to G$.
Prove that, under composition of functions, $\mathrm{Aut}(G)$ is a group. (This is the group of
symmetries of the group $G$).

Exercise 6.2 Given a group $G$ and an element $g \in G$, prove that $h \mapsto ghg^{-1}$ is an
isomorphism $G \to G$. Such an isomorphism is called an inner isomorphism. Prove
that the set of all inner isomorphisms of $G$ is a normal subgroup of $\mathrm{Aut}(G)$.

Exercise 6.3 Prove that if $H$ is a normal subgroup of $G$, then $gH = Hg$ for all
$g \in G$. Use a suitable subgroup $H$ of the dihedral group $D_3$, the group of symmetries
of an equilateral triangle, to refute the converse.

Exercise 6.4 Prove that if G is a topological group and if, for some g e G, the set 
{g} is closed, then G is Hausdorff. 

Exercise 6.5 Prove that for all topological groups $G$, the quotient group $G/\overline{\{e\}}$ exists
and is a Hausdorff topological group.

Exercise 6.6 Prove that an open subgroup H of a topological group G is also closed. 
(Hint: The complement of $H$ is a union of translates of $H$).

Exercise 6.7 Prove that the topology induced by the uniformity induced by a metric 
is identical to the topology induced by the metric. 

Exercise 6.8 For subsets $S$ and $T$ of a topological group $G$, consider the point-wise
product $ST = \{st \mid s \in S, t \in T\}$. Prove that if $S$ and $T$ are compact, then so is $ST$.

Exercise 6.9 Prove Theorems 6.10, 6.11, and 6.12.

Exercise 6.10 The following is a well-known theorem in the theory of uniform
spaces: A uniform space $(X, \mathcal{E})$ is metrizable if, and only if, the uniformity $\mathcal{E}$ is
generated by countably many entourages, where $\mathcal{E}$ is generated by the collection
$\mathcal{B} \subseteq \mathcal{E}$ if

$$\mathcal{E} = \{E \subseteq X \times X \mid \exists F \in \mathcal{B},\ E \supseteq F\}.$$

Prove that a first countable Hausdorff topological group is metrizable.

Further Reading 

For a broad physics-oriented introduction to groups see [4] or, for an introduction 
to groups in the context of Quantum Mechanics, see [5]. For a friendly introduction to 
topological groups see Chap. 3 of [1] (and the rest of the book for a delightful deeper 
study of the subject). For an elementary introduction to uniform spaces alongside 
topological spaces see [2]. Finally, for a comprehensive introduction to topological
groups, starting with an independent treatment of topological spaces and metric
spaces, see [3].


Chapter 7

Solved Problems 


7.1 Linear Spaces 

7.1 Consider the linear space $C(\mathbb{R}, \mathbb{R})$ and the vectors $x(t) = \sin(t)$, $y(t) = \cos(t)$,
and $z(t) = 1$. Show that $x, y, z$ are linearly independent, while $x^2, y^2, z^2$ are not.

7.2 Let $V$ be a linear space and $S \subseteq V$ a set of vectors. Prove that $S$ is linearly
independent if, and only if, every finite subset of $S$ is linearly independent.

7.3 Consider $\mathbb{R}$ as a linear space over itself. Prove that $\mathbb{R}$ has precisely two linear
subspaces. Now consider $\mathbb{R}$ as a linear space over $\mathbb{Q}$ and prove that it has infinitely
many linear subspaces.

7.4 Let $V$ and $W$ be linear spaces over the field $K$, and $X \subseteq V$ an arbitrary subset
of $V$. A linear relation in $X$ is any expression

$$0 = \sum_{k=1}^{m} \alpha_k x_k$$

where $m > 0$, $\alpha_1, \dots, \alpha_m \in K$, and $x_1, \dots, x_m \in X$. Let $S$ be the span of $X$ and
consider an arbitrary function $f : X \to W$. Prove that $f$ extends to a linear operator
$F : S \to W$ if, and only if, for any linear relation in $X$ as above, one has

$$0 = \sum_{k=1}^{m} \alpha_k f(x_k).$$

Conclude that if $\mathcal{B}$ is a basis for $V$, then any function $f : \mathcal{B} \to W$ extends to a
linear operator $F : V \to W$.

7.5 Let $T : V \to W$ be a linear isomorphism between linear spaces. Prove that $T$
maps a basis for $V$ to a basis for $W$.

7.6 Prove that every infinite dimensional linear space over $\mathbb{R}$ contains an isomorphic
copy of $\mathbb{R}^n$, for all $n > 0$. Is there a linear space that contains an isomorphic copy of
every linear space over $\mathbb{R}$?

7.7 Prove that the space M of 3 x 3 matrices with entries in R is spanned by the 
subspace of matrices of rank 2. 

7.8 Compute the dimension of the linear space $P_n$, $n > 0$, of polynomials with real
coefficients and degree at most $n$, and of the linear space $P$ of all polynomials with
real coefficients.

7.9 Let $A, B : V \to V$ be two invertible linear operators from a linear space to
itself. Prove that if $A$ and $B$ commute, i.e., $AB = BA$, then so do $A^{-1}$ and $B^{-1}$.

7.10 Prove that a linear space V over an arbitrary field K (if you like you can take 
K = R or K = C) is infinite dimensional if, and only if, V is isomorphic to a proper 
linear subspace of itself. 


7.2 Topological Spaces 

7.11 (Separation properties) A topological space $X$ is said to satisfy:

• the $T_0$ separation axiom if for all distinct points $x, y \in X$ there exists an open set
$U$ such that either $x \in U$ and $y \notin U$, or $y \in U$ and $x \notin U$;

• the $T_1$ separation axiom if for all distinct points $x, y \in X$ there exist two open sets
$U$ and $V$ such that $x \in U$ and $y \notin U$, and $y \in V$ and $x \notin V$;

• the $T_2$ separation axiom if $X$ satisfies the Hausdorff separation property, that is
for all distinct points $x, y \in X$ there exist two open sets $U$ and $V$ such that $x \in U$,
$y \in V$, and $U \cap V = \emptyset$.

Clearly, the separation properties are increasing in strength. For every separation
property, give an example of a topological space satisfying it, but not the next one.

7.12 (Locally connected spaces) A topological space $X$ is called locally (path)
connected at $x$ if, for every open set $V$ with $x \in V$, there exists a (path) connected
open set $U$ with $x \in U \subseteq V$. The space $X$ is said to be locally (path) connected if it
is locally (path) connected at every $x \in X$.

Show that a locally (path) connected space need not be (path) connected, nor does
a (path) connected space need be locally (path) connected. For the latter, consider
the comb space, the subset $C$ of $\mathbb{R}^2$ given by

$$C = \{(x, 0) \mid 0 \le x \le 1\} \cup \{(0, y) \mid 0 \le y \le 1\} \cup A_1 \cup A_2 \cup \cdots \cup A_n \cup \cdots$$

where $A_n = \{(1/n, t) \mid 0 \le t \le 1\}$. The set $C$ is given the subspace topology of the
Euclidean topology on $\mathbb{R}^2$. (Hint: Draw a picture of this space!)

7.13 (Locally compact spaces) A topological space $X$ is called locally compact at
$x \in X$ if there exists a compact set $C$ and an open set $U$ such that $x \in U \subseteq C$. The
space $X$ is locally compact when it is locally compact at every $x \in X$.

Show that every compact space is locally compact, but not every locally compact 
space is compact. Provide an example of a space that is not locally compact. 

7.14 Consider

$$X = \{A \in M_2(\mathbb{R}) \mid A^2 = 0\}.$$

Endow $X$ with the subspace topology of $M_2(\mathbb{R})$, where $M_2(\mathbb{R})$ is endowed with the
topology induced by the identification $M_2(\mathbb{R}) \to \mathbb{R}^4$ given by

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} \mapsto (a, b, c, d),$$

and $\mathbb{R}^4$ is endowed with the Euclidean topology. Decide whether $X$ is closed in
$M_2(\mathbb{R})$, whether $X$ is compact, and whether $X$ is connected.

7.15 Consider the topological spaces

$$X_1 = \{(x, 1) \in \mathbb{R}^2 \mid x \in \mathbb{R}\} \quad \text{and} \quad X_2 = \{(x, 2) \in \mathbb{R}^2 \mid x \in \mathbb{R}\}$$

endowed with the subspace topology from the Euclidean topology on $\mathbb{R}^2$. On the
space

$$\widetilde{X} = X_1 \sqcup X_2,$$

the coproduct of $X_1$ and $X_2$, consider the equivalence relation

$$(x, 1) \sim (x', 2) \iff x = x' \neq 0.$$

Let $X = \widetilde{X}/\sim$ be the quotient space. Decide whether:

1. $X$ is $T_0$, $T_1$, $T_2$;

2. $X$ is compact;

3. $X$ is path-connected.

7.16 Consider the spaces


X, = 


{(x, y, Z ) e R 3 

I x 2 + y 2 + - 2 = : 

{(yi,y 2 ) GR 2 1 

yi+yi = i}. 

{(yi,y 2 ) GR 2 | 

y 2 + y 2 + 2yi = 


J {(yt> yi) G R 2 | yj + yj-2yi = 

For every pair of spaces, decide whether the spaces are homeomorphic or not. 


7.17 Construct a topological space where the closed sets are stable under countable 
unions, but not under all arbitrary unions. 


7.18 Let $A \in M_n(\mathbb{R})$ be a symmetric matrix whose eigenvalues are positive. Decide
whether

$$\mathbb{F} = \{x \in \mathbb{R}^n \mid x^t A x = 1\}$$

is a compact subset of $\mathbb{R}^n$ with the Euclidean topology.

7.19 Consider a topological Hausdorff space X and a subset D C X which is dense 
and locally compact. Is $D$ necessarily open in $X$?

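7.20 Consider the ring $\mathbb{Z}$ of integer numbers. The spectrum of $\mathbb{Z}$ is the set

$$X = \{p \cdot \mathbb{Z} \mid p \text{ prime or } p = 0\},$$

where $p \cdot \mathbb{Z} = \{p \cdot k \mid k \in \mathbb{Z}\}$. For all $a \in \mathbb{Z}$ define the set

$$V(a \cdot \mathbb{Z}) = \{p \cdot \mathbb{Z} \in X \mid a \cdot \mathbb{Z} \subseteq p \cdot \mathbb{Z}\}.$$

• Prove that the collection $\mathcal{V} = \{V(a \cdot \mathbb{Z}) \mid a \in \mathbb{Z}\}$ is the set of closed subsets of
a topology on $X$. That topology is known as the Zariski topology on $\mathbb{Z}$.

• Does there exist a single point in $X$ whose closure is the whole space?

• Is $X$ Hausdorff?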


7.3 Metric Spaces 

7.21 Let $(X, d)$ be a metric space. Suppose that there exists an infinite countable set
$\{x_n\}_{n \in \mathbb{N}}$ of points such that $d(x_k, x_m) = 1$ for all $k, m \in \mathbb{N}$ with $k \neq m$. Prove that
$X$ is not compact.

7.22 Consider the set $\ell_\infty$ of all bounded sequences of real numbers with the metric
induced by the $\ell_\infty$ norm, i.e.,

$$d(x, y) = \sup_n |x_n - y_n|.$$

Prove that $\ell_\infty$ is connected and not compact.

7.23 Let $f : \mathbb{R} \to \mathbb{R}$ be an infinitely differentiable function with the property that
for all $x_0 \in \mathbb{R}$ there exists a natural number $n > 0$ with $f^{(n)}(x_0) = 0$. Prove that there
exists an open interval $I = (a, b)$, with $a < b$, such that $f$ agrees with a polynomial
function on $I$.

7.24 Consider the space $C([0, 1], \mathbb{R})$ of all continuous real-valued functions on the
interval $[0, 1]$, endowed with the $L_\infty$ norm

$$\|f\|_\infty = \max_{t \in [0,1]} |f(t)|.$$

Is this norm induced by any inner product on $C([0, 1], \mathbb{R})$?

7.25 Prove that the empty set and every singleton set admit a unique metric, while 
every other set admits infinitely many non-isometric metric structures. 

7.26 Let $(X, d)$ be a metric space. Consider a subset $S$ of $X$. Prove that the closure
$\overline{S}$ of $S$ in $X$ is the set

$$\{x \in X \mid d(x, S) = 0\},$$

where $d(x, S) = \inf_{s \in S} d(x, s)$.

7.27 Let $(X, d)$ be a metric space. Prove that the function $\bar{d} : X \times X \to \mathbb{R}$ defined
by $\bar{d}(x, y) = \min\{1, d(x, y)\}$ is a metric on $X$ which induces the same topology as
$d$ does.

7.28 Consider the spaces $\mathbb{R}$ and $\mathbb{R}^2$ endowed with the Euclidean metrics. Decide
whether there exists an isometry $f : \mathbb{R}^2 \to \mathbb{R}$.

7.29 Let $\{d_i\}_{i \in I}$ be a family of distance functions on a given set $X$. Prove that the
supremum $d_s : X \times X \to \mathbb{R}_+$ given by

$$d_s(x, y) = \sup_{i \in I} \{d_i(x, y)\}$$

is a distance function on $X$. In contrast, show that the similarly defined infimum of
distance functions need not be a distance function.

7.30 Let $f : X \to Y$ be a function between metric spaces.

1. Is continuity, or uniform continuity, a sufficient condition to ensure that if $X$ is
complete then $f(X)$ is complete?

2. Is continuity, or uniform continuity, a sufficient condition to ensure that if $X$ is
totally bounded then $f(X)$ is totally bounded?

3. Is continuity, or uniform continuity, a sufficient condition to ensure that if $X$ is
complete and totally bounded then $f(X)$ is complete and totally bounded?


7.4 Normed Spaces and Banach Spaces 

7.31 A metric function $d$ on a linear space $V$, with ground field $K = \mathbb{R}$ or $K = \mathbb{C}$,
is said to be translation invariant when

$$d(x, y) = d(x + z, y + z)$$

for all $x, y, z \in V$ and is said to be scale homogeneous when

$$d(\alpha x, \alpha y) = |\alpha| \, d(x, y)$$

for all $x, y \in V$ and $\alpha \in K$. Prove that $V$ is a normed space if, and only if, it is
endowed with a translation invariant and scale homogeneous metric.

7.32 Let $X = C([a, b], \mathbb{R})$ be the space of continuous functions $f : [a, b] \to \mathbb{R}$
and consider the assignment

$$\|f\| = \int_a^b |f(x)| \, dx$$

for all $f \in X$.

1. Give a direct proof that $f \mapsto \|f\|$ is a norm on $X$.

2. Decide whether $X$ with the induced metric $d(f, g) = \|f - g\|$ is a complete
metric space.

7.33 Consider $\mathbb{R}$ with the function $d : \mathbb{R} \times \mathbb{R} \to \mathbb{R}_+$ given by

$$d(x, y) = \frac{|x - y|}{1 + |x - y|}$$

for all $x, y \in \mathbb{R}$. Decide whether $d$ is a distance function on $\mathbb{R}$ and if so whether the
metric is induced by a norm.

7.34 Let $X$ be a linear space endowed with two norms, $\|\cdot\|_1$ and $\|\cdot\|_2$. Suppose that
there exists $K > 0$ such that

$$\|v\|_1 \le K \cdot \|v\|_2$$

holds for all $v \in X$. Prove that the topology induced by $\|\cdot\|_2$ is finer than the topology
induced by $\|\cdot\|_1$.

7.35 For vectors $x, y$ in a linear space $V$, let $L(x, y)$ be the affine line connecting $x$
and $y$, that is

$$L(x, y) = \{\alpha x + (1 - \alpha) y \mid 0 \le \alpha \le 1\}.$$

A normed space $V$ is said to be strictly convex if for all distinct vectors $x, y$ on the
unit sphere, i.e., $\|x\| = \|y\| = 1$, the line $L(x, y)$ intersects the unit sphere only at
$x$ and $y$.

1. Prove that for $x, y$ with $\|x\| = \|y\| = 1$ the line $L(x, y)$ is fully contained in the
unit ball $\{z \in V \mid \|z\| \le 1\}$.

2. Prove that a normed space induced by an inner product is strictly convex.

3. Prove that in a strictly convex normed space, if $x \neq y$ are vectors which satisfy
$\|x\| = \|y\| = 1$, then $\|x + y\| < 2$.

4. Of the $\ell_1$, $\ell_2$, and $\ell_\infty$ norms on $\mathbb{R}^2$, decide which, if any, are strictly convex.

7.36 Prove that the kernel of a bounded linear operator $A : V \to W$ is a closed
linear subspace of $V$.

7.37 Consider the normed space $\ell_2$ and the linear operator $A : \ell_2 \to \ell_2$ given by

$$A(\{x_n\}_{n \ge 1}) = \left\{\left(1 - \frac{1}{n}\right) x_n\right\}_{n \ge 1}.$$

Prove that $A$ is a bounded linear operator that never attains its norm, i.e., there exists
no $x_0 \neq 0$ such that

$$\|Ax_0\| = \|A\| \, \|x_0\|.$$

7.38 Let $A : U \to V$ and $B : V \to W$ be bounded linear operators between normed
spaces. Prove that the composition $BA$ is a bounded linear operator and that

$$\|BA\| \le \|B\| \, \|A\|.$$

7.39 Let SB be a Banach space and recall that B {/_'/£) . the set of all bounded linear 
operators A : ^ — »■ SB, is a normed space when endowed with the operator norm, 
and thus is a topological space. Prove that the set G C B (SB) consisting of the 
invertible operators is an open subset of B (SB). 

7.40 Let SB \ , SB 2 , SB 3 be Banach spaces with T : SB\ — > SB 2 a bounded linear 
operator (whose domain is all of SB\) and let S : SB 2 — > SBy be a closed linear 
operator whose domain is S>(S) C ^B 2 - Prove that .S' 7' is a closed linear operator. 
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7.5 Topological Groups 

7.41 Prove that the group $(\mathbb{R}^n, +)$ with the Euclidean topology is a Hausdorff
topological group. In fact, it is a topological vector space, i.e., all of the linear space
structure mappings are continuous.

7.42 Prove that the special linear group

$\mathrm{SL}_n(\mathbb{R}) = \{A \in \mathrm{GL}_n(\mathbb{R}) \mid \det(A) = 1\}$

is a topological group.

7.43 Prove that the orthogonal group

$\mathrm{O}(n) = \{A \in \mathrm{GL}_n(\mathbb{R}) \mid A^{-1} = A^t\}$

is a topological group.

7.44 Prove that the special orthogonal group

$\mathrm{SO}(n) = \{A \in \mathrm{O}(n) \mid \det(A) = 1\}$

is a topological group.

7.45 Prove that the symplectic group 

$\mathrm{Sp}_{2n}(\mathbb{R}) = \{A \in \mathrm{GL}_{2n}(\mathbb{R}) \mid A^t \cdot J_0 \cdot A = J_0\}$,

where 



is a topological group. 

7.46 Prove that the topological group $\mathrm{SO}(n)$ is path-connected.

7.47 Prove that the topological group $\mathrm{O}(n)$ is the disjoint union of two path-connected
components, $\mathrm{O}^+(n)$ and $\mathrm{O}^-(n)$.

7.48 Prove that

$\mathrm{GL}_n^+(\mathbb{R}) = \{A \in \mathrm{GL}_n(\mathbb{R}) \mid \det(A) > 0\}$

is a path-connected component of the topological group $\mathrm{GL}_n(\mathbb{R})$.

7.49 Prove that the topological group $\mathrm{GL}_n(\mathbb{R})$ has two connected components, more
precisely,

$\mathrm{GL}_n^+(\mathbb{R}) = \{A \in \mathrm{GL}_n(\mathbb{R}) \mid \det(A) > 0\}$

and

$\mathrm{GL}_n^-(\mathbb{R}) = \{A \in \mathrm{GL}_n(\mathbb{R}) \mid \det(A) < 0\}$.

7.50 Prove that the special linear group

$\mathrm{SL}_n(\mathbb{R}) = \{A \in \mathrm{GL}_n(\mathbb{R}) \mid \det(A) = 1\}$

is path-connected.


7.6 Solutions 

7.6.1 Linear Spaces 

7.1 Suppose that

\[ \alpha x + \beta y + \gamma z = 0 \]

for some scalars $\alpha, \beta, \gamma \in \mathbb{R}$. The equation is a functional equality, and thus for any
choice of $t \in \mathbb{R}$ we have that

\[ \alpha x(t) + \beta y(t) + \gamma z(t) = 0. \]

By considering the values $t = 0$, $t = \pi/2$, $t = \pi$, we obtain the equations
$\beta + \gamma = 0$, $\alpha + \gamma = 0$, and $-\beta + \gamma = 0$, from which $\alpha = \beta = \gamma = 0$ follows
easily. Thus the only linear combination resulting in the constantly zero function is
the trivial combination, showing the desired linear independence. Now, as for $x^2$,
$y^2$, and $z^2$, noting that $x^2(t) + y^2(t) = 1 = z^2(t)$, we see that $x^2 + y^2 - z^2 = 0$,
establishing the desired linear dependence.
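
The three sampled equations can be checked mechanically. The sketch below assumes, as the displayed equations suggest, that $x(t) = \sin t$, $y(t) = \cos t$, and $z(t) = 1$; the exercise statement itself is not reproduced here, so this choice is an inference, and the helper `det3` is hypothetical. A non-zero determinant of the sampled system confirms that only the trivial combination vanishes at all three points.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# Assumed functions x, y, z (inferred from the sampled equations).
functions = (math.sin, math.cos, lambda t: 1.0)
sample_points = (0.0, math.pi / 2, math.pi)

# Row for t holds the coefficients of (alpha, beta, gamma) in the equation at t.
system = [[fn(t) for fn in functions] for t in sample_points]
determinant = det3(system)

def squares_relation(t):
    """Value of x^2 + y^2 - z^2 at t; identically zero for the assumed functions."""
    return math.sin(t) ** 2 + math.cos(t) ** 2 - 1.0
```

Since the determinant is (up to rounding) $-2$, the homogeneous system has only the zero solution, while `squares_relation` vanishes everywhere, matching the dependence of the squares.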

7.2 Given a set $S$ of vectors in $V$, suppose first that $S$ is linearly independent. We
must show that every finite subset of it is linearly independent, so let $S_f \subseteq S$ be a
finite subset of $S$. Suppose that

\[ 0 = \sum_{k=1}^{m} \alpha_k x_k, \]

where $\alpha_1, \dots, \alpha_m$ are scalars and $x_1, \dots, x_m \in S_f$. But then the exact same linear
combination is also a linear combination of elements from the linearly independent
set $S$, and thus all scalars must be 0. This shows $S_f$ is linearly independent. In the
other direction, suppose every finite subset of $S$ is linearly independent. To show
that $S$ is linearly independent suppose that 0 is obtained as a linear combination
(necessarily a finite one!) as above, with $x_1, \dots, x_m \in S$. Consider then the finite set

\[ \{x_1, \dots, x_m\} \subseteq S. \]

We thus see that 0 is written as a linear combination of elements from that subset,
which is linearly independent by assumption. Thus all scalars must be 0, and thus $S$
is linearly independent.

7.3 Consider $\mathbb{R}$ as a linear space over itself and let $W$ be a non-trivial linear subspace
of $\mathbb{R}$. Thus $W$ contains some non-zero vector, namely a real number $x \ne 0$. The
dimension of $\mathbb{R}$ over itself is clearly 1, and thus any non-zero vector spans all of $\mathbb{R}$.
Since $W$, as a linear subspace, is its own span, we conclude that $W = \mathbb{R}$. Thus, we
showed that other than the trivial subspace $\{0\}$, the only other subspace of $\mathbb{R}$ is $\mathbb{R}$
itself, as required.

Now we consider $\mathbb{R}$ as a linear space over $\mathbb{Q}$, and note that this is an infinite
dimensional linear space. Let $\mathscr{B}$ be a Hamel basis for $\mathbb{R}$ over $\mathbb{Q}$. For every subset $X$
of $\mathscr{B}$ consider the span of $X$, and denote it by $W_X$. We claim that $W_X \ne W_Y$ for all
subsets $X, Y \subseteq \mathscr{B}$ with $X \ne Y$. Indeed, seeking a contradiction, suppose $W_X = W_Y$
and we may assume that $X$ is not a subset of $Y$. Choose a vector $x \in X - Y$. Since
$x \in W_X$ it follows that $x \in W_Y$. Thus

\[ x = \sum_{k=1}^{m} \alpha_k y_k, \]

where $\alpha_1, \dots, \alpha_m \in \mathbb{Q}$ and $y_1, \dots, y_m \in Y$. But then

\[ 0 = x - \sum_{k=1}^{m} \alpha_k y_k \]

is a non-trivial linear combination of vectors from the linearly independent set $\mathscr{B}$,
an impossibility. To conclude then, for each subset of $\mathscr{B}$ we obtained a unique linear
subspace of $\mathbb{R}$. There are thus infinitely many subspaces, as required. In fact, the
cardinality of a Hamel basis in this case was shown to be equal to $|\mathbb{R}|$, the cardinality
of the real numbers. We thus showed that $\mathbb{R}$ as a linear space over $\mathbb{Q}$ has at least
$|\mathscr{P}(\mathscr{B})|$ many linear subspaces, a cardinality known to be strictly larger than the
cardinality of $\mathscr{B}$, and thus of $\mathbb{R}$.

7.4 Suppose first that the given function $f : X \to W$ extends to a linear operator
$F : S \to W$. If

\[ 0 = \sum_{k=1}^{m} \alpha_k x_k \]

is a linear relation in $X$, then

\[ 0 = F(0) = F\Big(\sum_{k=1}^{m} \alpha_k x_k\Big) = \sum_{k=1}^{m} \alpha_k F(x_k) = \sum_{k=1}^{m} \alpha_k f(x_k), \]

as claimed. In the other direction, suppose the condition on linear relations in $X$ is
met. Given $y \in S$, write

\[ y = \sum_{x \in X} \alpha_x \cdot x \]

as a (finite!) linear combination of vectors from $X$. If

\[ F(y) = \sum_{x \in X} \alpha_x \cdot f(x) \]

is well-defined, then it clearly gives rise to the desired linear extension of $f$. So,
suppose that

\[ y = \sum_{x \in X} \beta_x \cdot x \]

is another expression of $y$ as a linear combination of vectors from $X$. But then, by
subtracting the two expressions, we obtain

\[ 0 = \sum_{x \in X} (\alpha_x - \beta_x) \cdot x, \]

which is a linear relation in $X$. It thus follows that

\[ 0 = \sum_{x \in X} (\alpha_x - \beta_x) \cdot f(x), \]

from which it follows that the formula for computing $F(y)$ is independent of the
presentation of $y$ as a linear combination of vectors from $X$, as desired.

As for the special case where $X = \mathscr{B}$ is a basis, note that the span of $X$ is then
the entire ambient space $V$, and that the linear independence of $\mathscr{B}$ implies that there
are no non-trivial linear relations in $X$, so the needed condition for the extension is
vacuously satisfied.

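7.5 Suppose that $T : V \to W$ is an isomorphism and that $\mathscr{B}$ is a basis for $V$. We
will show that $T(\mathscr{B}) = \{T(b) \mid b \in \mathscr{B}\}$ is a basis for $W$. Given an arbitrary $w \in W$,
consider $T^{-1}(w) \in V$. We may write $T^{-1}(w)$ as a linear combination

\[ T^{-1}(w) = \sum_{b \in \mathscr{B}} \alpha_b \cdot b \]

and thus

\[ w = T(T^{-1}(w)) = \sum_{b \in \mathscr{B}} \alpha_b \cdot T(b), \]

showing that $T(\mathscr{B})$ spans $W$. Next, suppose a linear combination

\[ 0 = \sum_{b \in \mathscr{B}} \alpha_b \cdot T(b) \]

is given. But

\[ \sum_{b \in \mathscr{B}} \alpha_b \cdot T(b) = T\Big(\sum_{b \in \mathscr{B}} \alpha_b \cdot b\Big) \]

and thus $\sum_{b \in \mathscr{B}} \alpha_b \cdot b \in \mathrm{Ker}(T) = \{0\}$ (since $T$ is injective). It follows that

\[ \sum_{b \in \mathscr{B}} \alpha_b \cdot b = 0 \]

and therefore that all of the coefficients are 0, as required.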

7.6 Suppose $V$ is an infinite dimensional linear space and let $\mathscr{B}$ be a basis for $V$.
Choose a countable subset $\{y_1, y_2, \dots\} \subseteq \mathscr{B}$ and let $W_n$ be the span of $\{y_1, \dots, y_n\}$.
It is immediate to verify that $T : \mathbb{R}^n \to V$ given by

\[ T(\alpha_1, \dots, \alpha_n) = \sum_{k=1}^{n} \alpha_k y_k \]

is a linear isomorphism onto $W_n$, identifying a copy of $\mathbb{R}^n$ inside $V$, as claimed.

As for the existence of a linear space $U$ which contains an isomorphic copy of
every linear space over $\mathbb{R}$, the fact that no such linear space exists is a consequence
of a set-theoretic result known as Cantor’s Theorem. Suppose that such a $U$ does
exist. By constructing free linear spaces one sees that linear spaces of arbitrarily
large cardinality exist. That is, given any set $X$, no matter how large, there always
exists a linear space having $X$ as a basis. Since $U$ is assumed to contain a copy of
every linear space, it follows that $X$ is in bijection with a subset of $U$. In other words,
there exists an injection $X \to U$, and thus $|X| \le |U|$, for all sets $X$.

In particular, for the set $X = \mathscr{P}(U)$ of all subsets of $U$ one has that

\[ |\mathscr{P}(U)| \le |U|. \]

However, the function $u \mapsto \{u\}$ is clearly an injection $U \to \mathscr{P}(U)$, and thus

\[ |U| \le |\mathscr{P}(U)|. \]

We thus conclude, by the Cantor-Schröder-Bernstein Theorem (see Preliminaries if
needed), that

\[ |U| = |\mathscr{P}(U)|. \]

However, this is known to be impossible by an argument we now present.

Theorem 7.1 (Cantor’s Theorem) For all sets $S$ there exists no surjective function
$S \to \mathscr{P}(S)$. In particular $|S| < |\mathscr{P}(S)|$.

Proof Suppose a surjective function $f : S \to \mathscr{P}(S)$ exists. Consider then the set

\[ S_1 = \{s \in S \mid s \notin f(s)\}, \]

which is clearly a subset of $S$, and thus $S_1 \in \mathscr{P}(S)$. Since $f$ is surjective, it follows
that there exists $s_1 \in S$ with $f(s_1) = S_1$. To obtain a contradiction let us consider
whether $s_1 \in S_1$ or not. If $s_1 \in S_1$, then $s_1 \notin f(s_1) = S_1$, which is absurd. However,
if $s_1 \notin S_1 = f(s_1)$, then $s_1$ fulfills the condition for being an element in $S_1$, and thus
$s_1 \in S_1$, again an absurdity. We must conclude that no such $f$ exists.

Since no surjection from $S$ to $\mathscr{P}(S)$ exists, no bijection exists either, and thus the
sets have different cardinalities. Since $s \mapsto \{s\}$ is clearly an injection $S \to \mathscr{P}(S)$ it
follows that $|S| < |\mathscr{P}(S)|$. □
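
For small finite sets the diagonal argument can be tested exhaustively. The following sketch, with hypothetical helper names `power_set`, `diagonal_set`, and `diagonal_always_missed`, enumerates every function $f : S \to \mathscr{P}(S)$ and confirms that the set $S_1 = \{s \in S \mid s \notin f(s)\}$ is never a value of $f$. It is an illustration only; the proof above covers arbitrary sets.

```python
from itertools import combinations, product

def power_set(s):
    """All subsets of s, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def diagonal_set(f, s):
    """Cantor's set S_1 = {x in S : x not in f(x)} for f given as a dict."""
    return frozenset(x for x in s if x not in f[x])

def diagonal_always_missed(s):
    """True if, for every f : S -> P(S), the diagonal set lies outside the image of f."""
    items = sorted(s)
    for images in product(power_set(s), repeat=len(items)):
        f = dict(zip(items, images))
        if diagonal_set(f, s) in images:
            return False
    return True
```

For a three-element set this checks all $8^3 = 512$ functions, so in particular none of them is surjective.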

7.7 Let $E_{ab}$ be the $3 \times 3$ matrix whose entries are all 0 except for the $(a, b)$ entry
being 1. Clearly, the set $\{E_{ab}\}_{a, b \in \{1, 2, 3\}}$ spans $M$. We have to find a basis composed
of rank 2 matrices. For $a, b, c, d \in \{1, 2, 3\}$, define

\[ F_{a,b;c,d} = E_{ab} + E_{cd} \quad \text{and} \quad G_{a,b;c,d} = E_{ab} - E_{cd}. \]

Note that, if $a \ne c$ and $b \ne d$, then $F_{a,b;c,d}$ and $G_{a,b;c,d}$ have rank 2. Note also that

\[ E_{ab} = \frac{1}{2} F_{a,b;c,d} + \frac{1}{2} G_{a,b;c,d}, \]

where we can choose $a \ne c$ and $b \ne d$. The reader can now easily identify a set of
rank 2 matrices that span $M$.
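
The two observations in Solution 7.7 can be verified by brute force. The sketch below is an illustration with hypothetical helper names; it computes ranks by exact Gaussian elimination over the rationals and checks, for every admissible choice of indices, that $E_{ab} + E_{cd}$ and $E_{ab} - E_{cd}$ have rank 2 and average to $E_{ab}$.

```python
from fractions import Fraction
from itertools import product

N = 3  # size of the square matrices

def unit_matrix(a, b):
    """E_ab: all entries 0 except a 1 in position (a, b), with 1-based indices."""
    return [[1 if (i, j) == (a, b) else 0 for j in range(1, N + 1)]
            for i in range(1, N + 1)]

def combine(p, q, sign):
    """Entrywise p + sign * q."""
    return [[x + sign * y for x, y in zip(row_p, row_q)]
            for row_p, row_q in zip(p, q)]

def rank(matrix):
    """Rank by Gaussian elimination over the rationals (exact arithmetic)."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def check_all():
    """Check rank(F) = rank(G) = 2 and (F + G)/2 = E_ab whenever a != c and b != d."""
    idx = range(1, N + 1)
    for a, b, c, d in product(idx, repeat=4):
        if a == c or b == d:
            continue
        e_ab, e_cd = unit_matrix(a, b), unit_matrix(c, d)
        f = combine(e_ab, e_cd, +1)
        g = combine(e_ab, e_cd, -1)
        half_sum = [[Fraction(x + y, 2) for x, y in zip(rf, rg)]
                    for rf, rg in zip(f, g)]
        if rank(f) != 2 or rank(g) != 2 or half_sum != e_ab:
            return False
    return True
```

Since every $E_{ab}$ is a combination of rank 2 matrices, the rank 2 matrices span $M$, and a basis can be extracted from them.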

7.8 We prove that $\dim(P_n) = n + 1$ by showing that

\[ \{1, x, x^2, \dots, x^n\} \]

is a basis. Clearly these vectors span $P_n$ and the fact that they are linearly independent
follows easily from the more general argument given next, where the dimension of
$P$ is computed.

We show that the dimension of $P$ is countably infinite by showing that

\[ \{1, x, x^2, \dots, x^n, \dots\} \]

is a basis. It is obvious that these vectors span $P$. To see that they are linearly
independent suppose that a linear combination of the vectors results in the zero
vector. In other words, $0 = \sum_{k=0}^{n} \alpha_k x^k$ for some real numbers $\alpha_0, \dots, \alpha_n$. But this
linear combination is a polynomial and it is well-known that the only polynomial
that evaluates to 0 on all real numbers is the 0 polynomial. Thus all coefficients must
be equal to 0, and therefore the only linear combination resulting in the zero vector
is the trivial linear combination.

7.9 We note first the general fact that if $A : U \to V$ and $B : V \to W$ are invertible
functions, then $(B \circ A)^{-1} = A^{-1} \circ B^{-1}$. This is nothing but the computation

\[ B(A(A^{-1}(B^{-1}(w)))) = B(B^{-1}(w)) = w \]

and similarly

\[ A^{-1}(B^{-1}(B(A(u)))) = A^{-1}(A(u)) = u \]

for all $w \in W$ and all $u \in U$. Now, going back to the linear operators $A$ and $B$,
assumed to commute, we have

\[ A^{-1} B^{-1} = (BA)^{-1} = (AB)^{-1} = B^{-1} A^{-1}, \]

as required.

7.10 Assume first that $V$ is infinite dimensional, and let $\mathscr{B}$ be an infinite basis of $V$.
Fix a vector $x_0 \in \mathscr{B}$ and consider the set $S = \mathscr{B} - \{x_0\}$. As a subset of a basis these
vectors are linearly independent, and thus $S$ is a basis of the subspace spanned by
$S$. Clearly, $x_0$ is missing from that span (otherwise the original $\mathscr{B}$ would be linearly
dependent, which is impossible since it is a basis), and thus the span of $S$ is a proper
subspace of $V$. To show that the span of $S$ is isomorphic to $V$ it suffices to show they
have equal dimensions, namely that $|\mathscr{B}| = |S|$. Clearly, $|S| \le |\mathscr{B}|$, since the inclusion
$S \to \mathscr{B}$ is an injection. If we can construct an injection in the other direction as well,
then, using the Cantor-Schröder-Bernstein Theorem, the cardinalities will indeed be
shown to be equal. To construct an injection $f : \mathscr{B} \to S$, let $\{x_m\}_{m \ge 1}$ be a countably
infinite list of distinct vectors in $S$. Define now $f(x_m) = x_{m+1}$ for all $m \ge 1$, with
$f(x_0) = x_1$, and $f(x) = x$ in all other cases. It is immediate to verify that $f$ is indeed
an injection. This completes the proof that if $V$ is infinite dimensional, then it is
isomorphic to a proper subspace of it.

In the other direction, suppose that $V \cong U$ for some proper subspace $U$. Since
linear spaces over the same field are isomorphic if, and only if, their dimensions
are equal, it follows that $\dim(V) = \dim(U)$. But in a finite dimensional space, if a
subspace has the same dimension as the ambient space, then the subspace coincides
with the ambient space, and is thus not proper. We conclude that $V$ must be infinite
dimensional.


7.6.2 Topological Spaces 

Remark 7.1 In topology the term neighborhood of a point $x$ in a topological space
$X$ may mean one of two things. Either an open set $U$ in $X$ with $x \in U$, or an arbitrary
subset $N \subseteq X$ which contains an open set $U$ with $x \in U$. The difference is only
cosmetic, and one can easily translate between the two situations. In the solutions
below we adhere to the first meaning of neighborhood.

7.11 Any indiscrete space with more than one element very clearly does not satisfy
the $T_0$ property. For an example of a space that is $T_0$ but not $T_1$ consider the Sierpinski
space $\mathbb{S} = \{0, 1\}$. For a $T_1$ space that is not $T_2$, let $S$ be an infinite set endowed with the
cofinite topology. Clearly the $T_1$ separation property is satisfied since given distinct
points $x, y \in S$, the sets $S - \{x\}$ and $S - \{y\}$ are open and clearly satisfy the
requirements. We show that $S$ is not $T_2$. Indeed, let $x, y$ be two distinct points in $S$
and suppose $x \in U$ and $y \in V$ for some open sets $U, V$. Then the complements of $U$
and of $V$ are finite subsets of $S$, and thus

\[ S - (U \cap V) = (S - U) \cup (S - V) \]

is also finite. Since $S$ is infinite it follows that $U \cap V \ne \emptyset$.

7.12 A locally (path) connected space that is not (path) connected is, for instance,
the space $(0, 1) \cup (3, 4)$ as a subspace of $\mathbb{R}$ with the Euclidean topology (as is easy
to see). As for the comb space, we first show that it is path connected, and thus also
connected. Indeed, any point admits a path to the point $(0, 0)$ by first traveling south
to meet the $X$-axis and then traveling west. Thus any two points can be joined by a
path in the space, so it is indeed path-connected. It is however not locally connected,
and thus not locally path-connected. Indeed, given any point of the form $p = (0, y)$
with $0 < y < 1$, consider a small enough circle centered at $p$ which does not intersect
the $X$-axis. Any open set containing $p$ must contain a subset of this form, but such
a set is not connected.

7.13 Let $X$ be a compact space. Given any $x \in X$ we may take $U = C = X$, and
then $x \in U \subseteq C$ with $U$ open and $C$ compact, so $X$ is locally compact. An example
of a locally compact space that is not compact is, e.g., $\mathbb{R}$ with the Euclidean topology.
Its non-compactness is obvious. As for it being locally compact, suppose $x \in \mathbb{R}$ is
given. Then $x \in (x - 1, x + 1) \subseteq [x - 1, x + 1]$, as required.

As an example of a space that is not locally compact, consider $\mathbb{Q}$ as a subspace
of $\mathbb{R}$ with the Euclidean topology. We will show that $\mathbb{Q}$ is not locally compact at any
point. This will be done by showing that in fact no compact subset of $\mathbb{Q}$
contains a non-empty open set. To see that, suppose that $U \subseteq C$ where $U$ is open
and $C$ is compact, and both are non-empty. It follows that $\mathbb{Q} \cap (a, b) \subseteq U$ for some
non-empty open interval $(a, b)$. Since $\mathbb{Q}$ is Hausdorff, the compact set $C$ is closed,
and thus, taking closures, we obtain that $[a, b] \cap \mathbb{Q} \subseteq C$. This is thus a closed subset
of a compact set in a Hausdorff space, and thus $\mathbb{Q} \cap [a, b]$ is compact. But that is
not the case. Indeed, let $y \in (a, b)$ be an irrational number and let $\{q_n\}_{n \in \mathbb{N}}$ be a
strictly increasing sequence of rationals tending to $y$. Let $U_n = \mathbb{Q} \cap (-\infty, q_n)$ and
consider also the set $\mathbb{Q} \cap (y, \infty)$. These are open sets in the subspace topology that
cover $\mathbb{Q} \cap [a, b]$ but (as the reader may easily verify) they admit no finite subcovering.

7.14 We prove that $X$ is closed. The condition

\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix}^2 = 0 \]

reads as

\[ a^2 + bc = 0, \quad ab + bd = 0, \quad bc + d^2 = 0, \quad ac + cd = 0. \]

In particular, by means of the homeomorphism $\phi : M_2(\mathbb{R}) \to \mathbb{R}^4$, the set $X$ can be
identified with the set

\[ S = \{u = (a, b, c, d) \in \mathbb{R}^4 \mid \psi_1(u) = \psi_2(u) = \psi_3(u) = \psi_4(u) = 0\}, \]

where

\[ \psi_1, \psi_2, \psi_3, \psi_4 : \mathbb{R}^4 \to \mathbb{R} \]

are the functions

\[ \psi_1(x, y, z, w) = x^2 + yz, \quad \psi_2(x, y, z, w) = xy + yw, \]
\[ \psi_3(x, y, z, w) = yz + w^2, \quad \psi_4(x, y, z, w) = xz + zw. \]

In other words,

\[ S = \psi_1^{-1}(\{0\}) \cap \psi_2^{-1}(\{0\}) \cap \psi_3^{-1}(\{0\}) \cap \psi_4^{-1}(\{0\}) \]

and since each $\psi_i$ is a polynomial function, and thus continuous, the set $\psi_i^{-1}(\{0\})$ is
closed (since $\{0\}$ is closed in $\mathbb{R}$). We see that $S$ is the intersection of four closed sets,
and hence it is itself closed.

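We show now that $X$ is not compact. Indeed, for all $n \in \mathbb{Z}$, consider the subset

\[ U_n = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in M_2(\mathbb{R}) \;\middle|\; a \in (n - 1, n + 1),\ b \in \mathbb{R},\ c \in \mathbb{R},\ d \in \mathbb{R} \right\} \cap X. \]

Each $U_n$ is an open set in $X$. Furthermore, one has that

\[ \bigcup_{n \in \mathbb{Z}} U_n = X. \]

But there is no finite sub-family of $\{U_n\}_{n \in \mathbb{Z}}$ that still covers $X$. Indeed, for any $m \in \mathbb{Z}$,
a matrix in $X$ whose upper-left entry is $m$, such as $\begin{pmatrix} m & m \\ -m & -m \end{pmatrix}$, belongs only to $U_m$.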

Finally, we show that $X$ is connected. In fact, we prove that $X$ is path-connected.
To do that, take any $A \in X$. We first notice that there exists a non-singular matrix
$M \in M_2(\mathbb{R})$ such that

\[ M A M^{-1} = \begin{pmatrix} 0 & b \\ 0 & 0 \end{pmatrix} \]

for some $b \in \mathbb{R}$. In other words, any matrix in $X$ is similar to an upper-triangular matrix.
Indeed, consider $A$ as a matrix in $M_2(\mathbb{C})$. Since $\mathbb{C}$ is algebraically closed, $A$ can be
triangulated in $M_2(\mathbb{C})$, namely, it is similar to a matrix

\[ \begin{pmatrix} a & b \\ 0 & c \end{pmatrix} \]

with $a, b, c$ possibly complex numbers. The condition $A^2 = 0$ reads as

\[ a^2 = 0, \quad ab + bc = 0, \quad c^2 = 0, \]

hence the eigenvalues are $a = 0$ and $c = 0$. In particular, the eigenvalues belong to
$\mathbb{R}$. Therefore $A$ is triangulable also in $M_2(\mathbb{R})$, proving the claim.

Now, consider the path

\[ \gamma(t) = M^{-1} \begin{pmatrix} 0 & tb \\ 0 & 0 \end{pmatrix} M. \]

Note that, for any $t \in [0, 1]$, $(\gamma(t))^2 = 0$, and hence $\gamma(t) \in X$. Note also that
$\gamma : [0, 1] \to X$ is continuous and that

\[ \gamma(0) = 0 \quad \text{and} \quad \gamma(1) = A, \]

proving that, for any $A \in X$, there exists a path connecting it to the zero matrix. This
clearly suffices to show that any two elements in $X$ can be connected by a path in $X$,
and thus $X$ is path-connected, as claimed.

7.15 1. We prove that $X$ is $T_1$ and hence also $T_0$. We have to prove that, for any pair
of distinct points $p, q \in X$, there exists a neighborhood $U_p$ of $p$ not containing
$q$ and a neighborhood $U_q$ of $q$ not containing $p$. Consider first the following
case. Suppose that the points are $p = [(x, 1)] = [(x, 2)]$ with $x \in \mathbb{R} - \{0\}$
and $q = [(x', y')]$ with $x' \in \mathbb{R}$ and $y' \in \{1, 2\}$. Take $r \in \mathbb{R}$ such that
$0 < r < |x - x'|$. Then the sets $U_p = \{[(u, 1)] \in X \mid |u - x| < r\}$ and
$U_q = \{[(u', y')] \in X \mid |u' - x'| < r\}$ are open neighborhoods of $p$, respectively
$q$, and it holds that $p \notin U_q$ and $q \notin U_p$. It remains to consider just
the case $p = [(0, 1)]$ and $q = [(0, 2)]$. In this case, we consider the sets
$U_p = \{[(u, 1)] \in X \mid u \in \mathbb{R}\}$ and $U_q = \{[(v, 2)] \in X \mid v \in \mathbb{R}\}$. These are
clearly neighborhoods of $p$, respectively $q$, and $p \notin U_q$ and $q \notin U_p$. The space
$X$ is thus $T_1$.

We prove next that $X$ is not $T_2$. Consider the points $p = [(0, 1)]$, $q = [(0, 2)]$ for
which we prove that no disjoint neighborhoods exist. Indeed, any neighborhood
of $p$ contains an open neighborhood of the type

\[ U_p = \{[(u, 1)] \in X \mid |u| < r\} \]

for some $r > 0$, and any neighborhood of $q$ contains an open neighborhood of
the type $U_q = \{[(u', 2)] \in X \mid |u'| < r'\}$ for some $r' > 0$. We thus see that
the intersection of any two neighborhoods of $p$ and $q$ is non-empty.

2. We prove next that $X$ is not compact. Indeed, for any $n \in \mathbb{Z} - \{0\}$, consider the
open set

\[ U_n = \{[(x, 1)] = [(x, 2)] \in X \mid n - 1 < x < n + 1\} \]

and consider also the open sets

\[ U_0^1 = \{[(x, 1)] \in X \mid -1 < x < 1\} \]

and

\[ U_0^2 = \{[(x, 2)] \in X \mid -1 < x < 1\}. \]

The family $\mathscr{U} = \{U_n\}_{n \in \mathbb{Z} - \{0\}} \cup \{U_0^1, U_0^2\}$ provides an open covering of $X$. But,
for any $m \in \mathbb{Z}$, the point $[(m, y)] \in X$, for $y \in \{1, 2\}$, belongs to just one of the
elements of the family $\mathscr{U}$. Hence there exists no finite sub-family of $\mathscr{U}$ covering
$X$. This proves that $X$ is not compact.

3. We prove that $X$ is path-connected. Indeed, consider a point $[(x, y)] \in X$ with
$x \in \mathbb{R}$ and $y \in \{1, 2\}$. Consider the path

\[ \gamma : [0, 1] \to X, \quad \gamma(t) = [((1 - t)x + t, y)]. \]

Note that $\gamma : [0, 1] \to X$ is continuous and that

\[ \gamma(0) = [(x, y)] \]

while

\[ \gamma(1) = [(1, 1)] = [(1, 2)]. \]

Hence, any point $[(x, y)] \in X$ can be connected to the point $[(1, 1)] = [(1, 2)]$
by means of a continuous path in $X$.

7.16 Note that $X_1$ and $X_2$ are not compact, while $X_3$ and $X_4$ are. Hence $X_i \not\cong X_j$
for $i = 1, 2$ and $j = 3, 4$. Further, note that $X_1 \cong X_2$ by means of the stereographic
projection

\[ \varphi : X_1 \to X_2, \quad (x, y) \mapsto \left( \frac{2x}{1 + x^2 + y^2}, \frac{2y}{1 + x^2 + y^2}, \frac{-1 + x^2 + y^2}{1 + x^2 + y^2} \right), \]

\[ \varphi^{-1} : X_2 \to X_1, \quad (x, y, z) \mapsto \left( \frac{x}{1 - z}, \frac{y}{1 - z} \right). \]

Finally, note that $X_3 \not\cong X_4$. Indeed, the property that a topological space remains
connected after removing two points is a topological invariant. But if one removes
two points from $X_4$, then the space obtained is connected, while removing any two
points from $X_3$ results in a disconnected space.
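
The two stereographic formulas can be sanity-checked numerically. The sketch below assumes that $X_1$ is the plane $\mathbb{R}^2$ and $X_2$ is the unit sphere with the north pole $(0, 0, 1)$ removed; the exercise statement is not reproduced here, so this reading is an inference from the formulas. The function names `phi` and `phi_inv` are hypothetical.

```python
import math

def phi(x, y):
    """Map a point of the plane to the unit sphere (north pole is never hit)."""
    s = 1 + x * x + y * y
    return (2 * x / s, 2 * y / s, (s - 2) / s)

def phi_inv(x, y, z):
    """Map a point of the sphere with z != 1 back to the plane."""
    return (x / (1 - z), y / (1 - z))

def on_unit_sphere(p, tol=1e-12):
    """Check that p = (x, y, z) has Euclidean norm 1 up to rounding."""
    return abs(math.sqrt(sum(c * c for c in p)) - 1.0) < tol

def round_trip_error(x, y):
    """Distance between (x, y) and phi_inv(phi(x, y))."""
    u, v = phi_inv(*phi(x, y))
    return math.hypot(u - x, v - y)
```

On sample points the image lies on the sphere and the round trip returns the starting point up to rounding, consistent with the two maps being mutually inverse.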

7.17 Let $S$ be an uncountable set and endow it with the cocountable topology. In
this topology every countable subset is closed since its complement obviously has
countable complement. In the other direction, if $F$ is a closed subset of $S$, then its
complement $S - F$ is open, and thus either $F = S$, or the complement of $S - F$,
which is $F$, is countable. We thus identified the closed sets to be

\[ \mathscr{F} = \{F \subseteq S \mid F \text{ is countable}\} \cup \{S\}. \]

Since a countable union of countable sets is countable, it follows that the collection
$\mathscr{F}$ is stable under countable unions. To see that taking arbitrary unions may take one
outside of the collection $\mathscr{F}$ it suffices to observe that a proper uncountable subset of
$S$ exists. Indeed, let $X \subseteq S$ be such a set. Since

\[ X = \bigcup_{x \in X} \{x\} \]

the set $X$ is certainly a union of closed sets. However, since $X$ is uncountable and
$X \ne S$, it follows that $X$ is not closed. We leave it to the reader to exhibit proper
uncountable subsets of $S$.

7.18 Since $A$ is a real symmetric matrix, it is diagonalizable. It follows that, up to a
homeomorphism induced by a change of coordinates of $\mathbb{R}^n$, we can suppose that $A$
is a diagonal matrix with positive entries $\lambda_1 > 0, \dots, \lambda_n > 0$,

\[ A = \begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix}, \]

and so

\[ W = \left\{ \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} \in \mathbb{R}^n \;\middle|\; \lambda_1 x_1^2 + \cdots + \lambda_n x_n^2 = 1 \right\}. \]

The set $W$ is closed and bounded, and hence compact, in $\mathbb{R}^n$ with the Euclidean
topology.

7.19 We prove that $D$ is open in $X$. Take any point $p \in D$. Since $D$ is locally
compact, there exists a compact set $K \subseteq D$ in $D$ and an open set $U$ in $X$ such that

\[ p \in U \cap D \subseteq K. \]

Consider the set $U - K \subseteq X$, which is open in $X$ since $K$ is compact in the Hausdorff
space $X$, and is thus closed. Since $D$ is dense, if the open set $U - K$ were non-empty,
it would intersect $D$, but this contradicts

\[ U \cap D \subseteq K. \]

Hence $U - K = \emptyset$, that is

\[ p \in U \subseteq K \subseteq D. \]

Hence the point $p$ is an interior point of $D$. As $p$ was arbitrary, we conclude that all
of the points of $D$ are interior, and thus that $D$ is open.

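7.20 1. We have to show that $\mathcal{V}$ contains the empty set and the whole space
$X$, and that it is closed under finite unions and under arbitrary intersections.
Indeed, $\emptyset = V(1 \cdot \mathbb{Z}) \in \mathcal{V}$ and $X = V(0 \cdot \mathbb{Z}) \in \mathcal{V}$. Take $V(a_1 \cdot \mathbb{Z}) \in \mathcal{V}$ and
$V(a_2 \cdot \mathbb{Z}) \in \mathcal{V}$. Note that

\[ V(a_1 \cdot \mathbb{Z}) \cup V(a_2 \cdot \mathbb{Z}) = V(\mathrm{lcm}\{a_1, a_2\} \cdot \mathbb{Z}) \in \mathcal{V}. \]

Consider the family $\{V(a_n \cdot \mathbb{Z})\}_{n \in \mathbb{Z}} \subseteq \mathcal{V}$. Note that

\[ \bigcap_{n \in \mathbb{Z}} V(a_n \cdot \mathbb{Z}) = V(\gcd\{a_n : n \in \mathbb{Z}\} \cdot \mathbb{Z}) \in \mathcal{V}. \]

2. We prove that $\{0 \cdot \mathbb{Z}\} \subseteq X$ is dense in $X$, that is, $\overline{\{0 \cdot \mathbb{Z}\}} = X$. Indeed, the only
closed set containing $\{0 \cdot \mathbb{Z}\}$ is $V(0 \cdot \mathbb{Z}) = X$. In particular, $X$ is the smallest
closed set containing $\{0 \cdot \mathbb{Z}\}$, which is precisely the closure.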

3. We prove now that $X$ is not Hausdorff. Indeed, note that for any non-zero $a \in \mathbb{Z}$ the
closed set $V(a \cdot \mathbb{Z}) = \{p \cdot \mathbb{Z} \mid p \text{ is a prime dividing } a\}$ is a finite set. Thus every
non-empty open set has a finite complement and, since $X$ is infinite, any two
non-empty open sets in $X$ intersect. Hence $X$ is not $T_2$.


7.6.3 Metric Spaces

7.21 Recall that in a metric space every compact set is also sequentially compact.
Since $\{x_n\}_{n \in \mathbb{N}}$ does not admit any convergent subsequence, the space $(X, d)$ cannot
be sequentially compact, and consequently is not compact. Another approach to this
problem is to use the fact that a metric space is compact if, and only if, it is complete
and totally bounded. But clearly $X$ is not totally bounded (though it may be bounded
and it may be complete).

7.22 To show that $\ell_\infty$ is connected we actually show that it is path-connected and
thus consider an arbitrary bounded real sequence $x = (x_n)_n \in \ell_\infty$, for which it
suffices to construct a path to the 0 vector, namely the constantly 0 sequence. To that
end, consider the function

\[ \gamma : [0, 1] \to \ell_\infty, \quad \gamma(t) = (t x_n)_n. \]

Notice that the codomain is indeed $\ell_\infty$, as is immediately seen (in fact this is nothing
but the closure of $\ell_\infty$ under scalar products). Since $\gamma(1) = x$ and $\gamma(0) = 0$, it remains
to show that $\gamma$ is continuous with respect to the relevant metrics. And indeed, for all
$t, s \in [0, 1]$, it holds that

\[ d(\gamma(t), \gamma(s)) = d((t x_n)_n, (s x_n)_n) = \sup_n |(t - s) \cdot x_n| = |t - s| \cdot \sup_n |x_n| = \|x\|_\infty \cdot d(s, t) \]

and thus $\gamma$ is uniformly continuous (notice that we actually proved $\gamma$ is Lipschitz),
and thus continuous, as was needed.

To see that $\ell_\infty$ is not compact it suffices to construct a sequence of infinitely
many vectors with all pair-wise distances equal to 1. This can easily be done in many
ways. We mention here another approach. Notice that the set of all vectors of the
form $x = (x_1, 0, 0, 0, \dots)$, with $x_1 \in \mathbb{R}$ arbitrary, in other words the kernel of the
shift mapping $f : \ell_\infty \to \ell_\infty$, is a closed subspace of $\ell_\infty$ which, with the induced
metric, is isometric to $\mathbb{R}$ with the Euclidean metric. Since $\ell_\infty$ is Hausdorff (any
metric space is Hausdorff), if $\ell_\infty$ were compact, then $\mathbb{R}$ would be compact as well
(since any closed subspace of a compact Hausdorff space is compact). But $\mathbb{R}$ is of
course not compact.

7.23 For all $n \ge 0$ let $X_n = \{x \in \mathbb{R} \mid f^{(n)}(x) = 0\}$. Since $f$ is infinitely differentiable,
$f^{(n)}$ is continuous, and thus the set $X_n$, being the inverse image of the closed set $\{0\}$,
is closed. Moreover, the condition on the function $f$ states precisely that

\[ \mathbb{R} = \bigcup_{n \ge 0} X_n \]

and thus we have expressed the non-empty and complete metric space $\mathbb{R}$ (with the
Euclidean metric) as a countable union of closed sets. By Baire’s Theorem at least
one of the sets $X_n$ contains a non-empty open set. Thus, there exists an $n \ge 0$ and an
open set $U \ne \emptyset$ with $U \subseteq X_n$. But any non-empty open set in $\mathbb{R}$, by definition, contains
an open interval. Thus, there exist $a < b$ with $(a, b) \subseteq X_n$. In other words, $f^{(n)}(x) = 0$ for
all $x \in (a, b)$. Integrating $n$ times in the range $a$ to $b$ reveals that, on $(a, b)$, $f$ is a
polynomial of degree at most $n$.

7.24 Recall that in any inner product space the parallelogram law

\[ \|x + y\|^2 + \|x - y\|^2 = 2\|x\|^2 + 2\|y\|^2 \]

holds for all vectors $x$ and $y$. We prove that there exists no inner product on
$C([0, 1], \mathbb{R})$ whose associated norm is $\|\cdot\|_\infty$ by showing the parallelogram law fails.
Indeed, consider, for example, the continuous functions $x, y : [0, 1] \to \mathbb{R}$ given by

\[ x(t) = 1 \quad \text{and} \quad y(t) = t. \]

One computes that

\[ \|x + y\|_\infty^2 + \|x - y\|_\infty^2 = \max_{t \in [0,1]} |x(t) + y(t)|^2 + \max_{t \in [0,1]} |x(t) - y(t)|^2 \]
\[ = \max_{t \in [0,1]} |1 + t|^2 + \max_{t \in [0,1]} |1 - t|^2 = 4 + 1 = 5 \]

while

\[ 2\|x\|_\infty^2 + 2\|y\|_\infty^2 = 2 \max_{t \in [0,1]} |x(t)|^2 + 2 \max_{t \in [0,1]} |y(t)|^2 \]
\[ = 2 \max_{t \in [0,1]} |1|^2 + 2 \max_{t \in [0,1]} |t|^2 = 2 \cdot 1 + 2 \cdot 1 = 4. \]

Therefore, the norm $\|\cdot\|_\infty$ does not satisfy the parallelogram law and is thus not induced
by any inner product.
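
The computation can be reproduced numerically. The sketch below approximates the supremum norm on $[0, 1]$ by a maximum over a uniform grid, which is an assumption that happens to be exact for these two functions because their extreme values occur at the endpoints; `sup_norm` is a hypothetical helper.

```python
def sup_norm(f, samples=1001):
    """Approximate the supremum norm of f on [0, 1] by sampling a uniform grid."""
    return max(abs(f(k / (samples - 1))) for k in range(samples))

def x(t):
    return 1.0

def y(t):
    return t

# Left-hand side of the parallelogram law: ||x + y||^2 + ||x - y||^2.
lhs = sup_norm(lambda t: x(t) + y(t)) ** 2 + sup_norm(lambda t: x(t) - y(t)) ** 2

# Right-hand side: 2||x||^2 + 2||y||^2.
rhs = 2 * sup_norm(x) ** 2 + 2 * sup_norm(y) ** 2
```

The two sides come out as 5 and 4 respectively, matching the hand computation.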

7.25 A metric on the empty set is a function $d : \emptyset \times \emptyset \to \mathbb{R}_+$ satisfying the metric
axioms. The domain is the empty set, and thus there is a unique such function, which
vacuously satisfies the metric axioms. As for a metric on a singleton set $X = \{p\}$,
that is a function $X \times X \to \mathbb{R}_+$, notice that the axioms require that $d(p, p) = 0$,
and thus $d$ is already determined. It is trivially seen that this $d$ indeed determines a
metric on $X$. Finally, if $X$ contains at least two distinct points $p$ and $q$, note
first that there always exists a metric on $X$, for instance $d(x, y) = 1$ for
all $x \ne y$ and $d(x, x) = 0$ for all $x$. The existence of infinitely many non-isometric
metric structures follows by noting that if $d$ is any metric function and $\alpha > 0$ an
arbitrary positive real number, then

\[ \alpha d : X \times X \to \mathbb{R}_+ \]

defined by $(\alpha d)(x, y) = \alpha d(x, y)$ is also a metric on $X$. Taking $d$ to be the metric above,
every non-zero distance in $(X, \alpha d)$ equals $\alpha$, so the spaces $(X, \alpha d)$ and $(X, \beta d)$ are
isometric if, and only if, $\alpha = \beta$ or $d(x, y) = 0$ for all points. However,
since $p \ne q$ we have that $d(p, q) \ne 0$. We thus obtain infinitely many non-isometric
metrics on $X$.

7.26 Suppose that $x \in X$ satisfies $d(x, S) = 0$ and let $U$ be an arbitrary open set with
$x \in U$. It then holds that $B_\varepsilon(x) \subseteq U$ for some $\varepsilon > 0$. Since $d(x, S) = 0$ it follows
that there is an element $s \in S$ with $d(x, s) < \varepsilon$. In particular then $s \in B_\varepsilon(x) \subseteq U$.
We thus showed that every open set containing $x$ intersects $S$, and thus $x \in \overline{S}$.

Conversely, if $x \notin \overline{S}$, then (noting that $\overline{S}$ is closed, and thus its complement is
open) there exists an $\varepsilon > 0$ with $B_\varepsilon(x) \subseteq X - \overline{S}$. But then $d(x, s) \ge \varepsilon$ for all $s \in S$,
and thus $d(x, S) \ge \varepsilon$.

7.27 We establish the metric space axioms for $\bar{d}$. Clearly, $\bar{d}(x, y) \ge 0$ for all $x, y \in X$.
If $\bar{d}(x, y) = 0$, then $d(x, y) = 0$ and thus $x = y$. For all $x, y \in X$ if $d(x, y) < 1$,
then

\[ \bar{d}(x, y) = d(x, y) = d(y, x) = \bar{d}(y, x) \]

while if $d(x, y) \ge 1$, then $d(y, x) \ge 1$ too, and thus $\bar{d}(x, y) = 1 = \bar{d}(y, x)$. Finally,
the verification of the triangle inequality follows similarly by case splitting.

To show that $d$ and $\bar{d}$ induce the same topology, it suffices to notice that an open
ball $B_\varepsilon(x)$ of radius $\varepsilon < 1$ is the same set when computed in $(X, d)$ as it is when
computing in $(X, \bar{d})$.
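
A quick experiment illustrates both claims. The sketch below assumes that the modified metric is the truncation $\min\{d, 1\}$, which is what the case split at distance 1 suggests, and uses the Euclidean plane as a sample metric space; the names `d`, `d_bar`, `triangle_holds`, and `ball` are hypothetical. It checks the triangle inequality on random triples and compares small balls in the two metrics.

```python
import itertools
import math
import random

def d(p, q):
    """Euclidean distance in the plane, used here as the sample metric."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def d_bar(p, q):
    """Assumed truncated metric min{d, 1}."""
    return min(d(p, q), 1.0)

def triangle_holds(metric, pts, tol=1e-12):
    """Check metric(a, c) <= metric(a, b) + metric(b, c) for all triples."""
    return all(metric(a, c) <= metric(a, b) + metric(b, c) + tol
               for a, b, c in itertools.product(pts, repeat=3))

def ball(metric, center, radius, pts):
    """The sample points lying in the open ball of the given radius."""
    return {p for p in pts if metric(center, p) < radius}

random.seed(0)
points = [(random.uniform(-2, 2), random.uniform(-2, 2)) for _ in range(25)]
```

For radii below 1 the two metrics select exactly the same points in each ball, which is the observation the solution uses to compare the topologies.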

7.28 We prove that there exists no isometry from $\mathbb{R}^2$ to $\mathbb{R}$ when endowed with the
Euclidean metrics $d_{\mathbb{R}^2}$ and $d_{\mathbb{R}}$ respectively. Indeed, suppose that such an isometry
$f : \mathbb{R}^2 \to \mathbb{R}$ exists. Take $P, Q, R \in \mathbb{R}^2$ such that

\[ d_{\mathbb{R}^2}(P, Q) = d_{\mathbb{R}^2}(P, R) = d_{\mathbb{R}^2}(Q, R) = 1. \]

Denote by $p = f(P)$, $q = f(Q)$, and $r = f(R)$ the images in $\mathbb{R}$ of the three chosen
points. Since $f$ is an isometry, it follows that

\[ |p - q| = d_{\mathbb{R}}(p, q) = d_{\mathbb{R}}(f(P), f(Q)) = d_{\mathbb{R}^2}(P, Q) = 1, \]

and analogously that $|p - r| = 1$ and $|q - r| = 1$. However, elementary algebra
reveals this to be an impossibility.

7.29 We show that $d_s$ satisfies the metric axioms, remembering that each $d_i$ does.
Firstly, for all $x \in X$

\[ d_s(x, x) = \sup_{i \in I} d_i(x, x) = \sup\{0\} = 0. \]

Next, for all $x, y \in X$

\[ d_s(x, y) = \sup_{i \in I} \{d_i(x, y)\} = \sup_{i \in I} \{d_i(y, x)\} = d_s(y, x). \]

Finally, for all $x, y, z \in X$, we need to show that

\[ \sup_{i \in I} \{d_i(x, z)\} \le \sup_{i \in I} \{d_i(x, y)\} + \sup_{i \in I} \{d_i(y, z)\} \]

for which it suffices to show, for each $i \in I$, that

\[ d_i(x, z) \le \sup_{i \in I} \{d_i(x, y)\} + \sup_{i \in I} \{d_i(y, z)\}. \]

And indeed, given $i \in I$, using the triangle inequality for $d_i$

\[ d_i(x, z) \le d_i(x, y) + d_i(y, z) \le \sup_{i \in I} \{d_i(x, y)\} + \sup_{i \in I} \{d_i(y, z)\}, \]

completing the proof.

For an example showing that the infimum of metric functions need not be a metric
function, consider a set $X = \{a, b, c\}$ with three distinct points. On it consider the
following assignments of distances

    d_1(a, b) = 1    d_2(a, b) = 2
    d_1(b, c) = 2    d_2(b, c) = 1
    d_1(a, c) = 3    d_2(a, c) = 3

which are easily seen to endow $X$ with two structures of metric space. However, when
computing the minima of the distances one obtains the function $e$ with $e(a, b) =
e(b, c) = 1$, and $e(a, c) = 3$. Therefore $e(a, c) > e(a, b) + e(b, c)$ and so the
triangle inequality fails.
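
The three-point counterexample is small enough to check exhaustively. The sketch below encodes the two distance tables from the solution, extends them symmetrically with zero on the diagonal, and tests the triangle inequality for $d_1$, $d_2$, their pointwise minimum, and their pointwise maximum; the helper names `dist` and `satisfies_triangle` are hypothetical.

```python
from itertools import permutations

# Distances on unordered pairs; symmetry and d(p, p) = 0 are supplied by dist().
d1 = {("a", "b"): 1, ("b", "c"): 2, ("a", "c"): 3}
d2 = {("a", "b"): 2, ("b", "c"): 1, ("a", "c"): 3}

def dist(table, p, q):
    """Look up the distance between p and q in a table of unordered pairs."""
    if p == q:
        return 0
    return table[(p, q)] if (p, q) in table else table[(q, p)]

def satisfies_triangle(table, points="abc"):
    """Check dist(p, r) <= dist(p, q) + dist(q, r) for all distinct p, q, r."""
    return all(dist(table, p, r) <= dist(table, p, q) + dist(table, q, r)
               for p, q, r in permutations(points, 3))

# Pointwise minimum (the function e of the solution) and pointwise maximum.
e = {pair: min(d1[pair], d2[pair]) for pair in d1}
s = {pair: max(d1[pair], d2[pair]) for pair in d1}
```

Both tables and their maximum pass the check, while the minimum fails it at the pair $(a, c)$, in line with the first part of the solution on suprema.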

7.30 1. No, not continuity nor uniform continuity suffices. For instance, let A = M 

with the Euclidean topology and consider / : 1R — ► R. with f(x) = arctan(.r), 
which is uniformly continuous, and thus also continuous. Then M is complete 
but the set f{x) = (— jr/2, tt/ 2) is not complete. 

2. Uniform continuity is a sufficient condition. Indeed, to show that $f(X)$ is totally
bounded, let $\varepsilon > 0$ be given; we will find a covering of $f(X)$ by finitely
many sets whose diameters do not exceed $\varepsilon$. Let $\delta > 0$ correspond to the
given $\varepsilon$, that is, $\mathrm{diam}(f(S)) \le \varepsilon$ whenever $S \subseteq X$
satisfies $\mathrm{diam}(S) < \delta$. Since $X$ is totally bounded, one can cover $X$ by
sets $S_1, \dots, S_m$ whose diameters are smaller than $\delta$. Each of the images
$f(S_1), \dots, f(S_m)$ has diameter not exceeding $\varepsilon$, and they cover $f(X)$,
as needed. Finally, we note that continuity alone does not suffice. For instance,
consider the function $\tan : (-\pi/2, \pi/2) \to \mathbb{R}$, with $\mathbb{R}$ endowed
with the Euclidean metric and the interval with the induced metric. The interval
is totally bounded but its image, $\mathbb{R}$, is not even bounded.

3. Continuity, and thus also uniform continuity, suffices. Indeed, in a metric space
a set is compact if, and only if, it is complete and totally bounded. Moreover,
for a continuous function $f$, the image $f(C)$ is compact whenever $C$ is.


7.6.4 Normed Spaces and Banach Spaces

7.31 First we show that the metric $d(x, y) = \|x - y\|$ induced by the norm of a
normed space $V$ is translation invariant and scale homogeneous. Indeed,

\[ d(x + z, y + z) = \|(x + z) - (y + z)\| = \|x - y\| = d(x, y) \]

holds for all $x, y, z \in V$, and for any scalar $\alpha$

\[ d(\alpha x, \alpha y) = \|\alpha x - \alpha y\| = \|\alpha (x - y)\| = |\alpha| \|x - y\| = |\alpha| d(x, y). \]

In the other direction, suppose that $V$ is endowed with a translation invariant and
scale homogeneous metric $d$. We then define, for all $x \in V$

\[ \|x\| = d(0, x) \]

and proceed to show that the axioms for a normed space are satisfied. It is clear that
$\|x\| \ge 0$ for all $x \in V$ and that if $\|x\| = 0$, then $d(0, x) = 0$, and thus
$x = 0$. For any scalar $\alpha$ and $x \in V$ one has

\[ \|\alpha x\| = d(0, \alpha x) = d(\alpha \cdot 0, \alpha x) = |\alpha| d(0, x) = |\alpha| \|x\|. \]

Finally, for all $x, y \in V$ we have, by translation invariance, that
$d(x, x + y) = d(0, y)$. It then follows that

\[ \|x + y\| = d(0, x + y) \le d(0, x) + d(x, x + y) = d(0, x) + d(0, y) = \|x\| + \|y\|, \]

as required.

7.32 1. We establish the norm axioms for the proposed norm function.
Clearly $\|f\| \ge 0$ since the integral of a non-negative function is non-negative.
Moreover, suppose that $\|f\| = 0$ for a continuous function $f : [a, b] \to \mathbb{R}$.
If $f \neq 0$, then $f(x_0) \neq 0$ for some $x_0 \in [a, b]$. Since $|f(x)|$ is continuous
and $|f(x_0)| > 0$ it follows that there is an interval $(c, d) \subseteq [a, b]$ upon which
$|f(x)|$ attains values greater than $|f(x_0)|/2 > 0$. It follows that

\[ \int_a^b |f(x)| \, dx \ge \int_c^d |f(x)| \, dx \ge (d - c) \, |f(x_0)|/2 > 0, \]

which is a contradiction. We conclude thus that $\|f\| = 0$ implies $f = 0$. Next,
for all $\lambda \in \mathbb{R}$

\[ \|\lambda \cdot f\| = \int_a^b |\lambda \cdot f(x)| \, dx = |\lambda| \cdot \int_a^b |f(x)| \, dx = |\lambda| \cdot \|f\|. \]

Finally, for $f : [a, b] \to \mathbb{R}$ and $g : [a, b] \to \mathbb{R}$, it holds that

\[ \|f + g\| = \int_a^b |f(x) + g(x)| \, dx \le \int_a^b |f(x)| \, dx + \int_a^b |g(x)| \, dx = \|f\| + \|g\|, \]

establishing the triangle inequality.

2. We prove that the metric space $(X, d)$ with $d(f, g) = \|f - g\|$ is not complete.
For simplicity, assume that $[a, b] = [-1, 1]$. We provide an example of a Cauchy
sequence in $X$ having no limit in $X$. For all $n \in \mathbb{N}$, consider the function

\[ f_n : [-1, 1] \to \mathbb{R}, \qquad
f_n(x) = \begin{cases}
-1 & \text{for } -1 \le x \le -\frac{1}{n}; \\
nx & \text{for } -\frac{1}{n} < x < \frac{1}{n}; \\
1 & \text{for } \frac{1}{n} \le x \le 1.
\end{cases} \]

Note that $f_n \in X$ for all $n \in \mathbb{N}$. Furthermore, for $n, m \in \mathbb{N}$
with $n < m$ one computes

\[ \|f_n - f_m\| = \int_{-1}^{1} |f_n(x) - f_m(x)| \, dx
= 2 \left( \int_0^{1/m} (m - n) x \, dx + \int_{1/m}^{1/n} (1 - nx) \, dx \right)
\le 2 \int_0^{1/n} (1 - nx) \, dx = \frac{1}{n}. \]

It thus follows that the sequence $(f_n)_{n \in \mathbb{N}}$ is a Cauchy sequence in $(X, d)$.

We prove now that $(f_n)_{n \in \mathbb{N}}$ has no limit in $X$. Indeed, we claim that, if
such a limit $f \in X$ were to exist, then it would be that

\[ f(x) = -1 \text{ for all } x \in [-1, 0) \quad \text{and} \quad f(x) = 1 \text{ for all } x \in (0, 1], \]

which is absurd since such a function $f$ is not continuous on $[-1, 1]$. To prove the
claim, suppose that there were a point $x \in [-1, 0)$ such that $|f(x) - (-1)| = \delta$
for some $\delta > 0$. For simplicity, we assume $x \neq -1$: otherwise, by continuity, if
$x = -1$, then there exists a point $x' > -1$ such that $|f(x') - (-1)| = \delta' > 0$.
Then, by continuity, there would be a neighbourhood $(x - \varepsilon, x + \varepsilon) \subseteq [-1, 0)$
of $x$, where $\varepsilon > 0$, such that $|f(y) - (-1)| > \delta/2$ for all
$y \in (x - \varepsilon, x + \varepsilon)$. Choose $N > -1/(x + \varepsilon)$, so that
$x + \varepsilon < -1/N$. Hence, for all $n > N$, the function $f_n$ is constantly $-1$ on
$(x - \varepsilon, x + \varepsilon)$ and it holds that

\[ \|f_n - f\| = \int_{-1}^{1} |f_n(y) - f(y)| \, dy \ge \int_{x - \varepsilon}^{x + \varepsilon} |f_n(y) - f(y)| \, dy \ge 2 \varepsilon \cdot \frac{\delta}{2} = \varepsilon \delta > 0, \]

which is not possible since $\|f_n - f\| \to 0$. The case of a point $x \in (0, 1]$ is
treated symmetrically.
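
As a numerical sanity check of the Cauchy estimate, the sketch below approximates $\|f_n - f_m\|$ on $[-1, 1]$ by a midpoint Riemann sum. It assumes the piecewise definition of $f_n$ with value $-1$ to the left of $-1/n$, slope $n$ in the middle, and value $1$ to the right of $1/n$; the function names are ours. Evaluating the two integrals in the solution directly gives the exact value $1/n - 1/m$, which the approximation should match closely and which is bounded by $1/n$.

```python
def f(n, x):
    """The piecewise linear function f_n on [-1, 1]."""
    if x <= -1.0 / n:
        return -1.0
    if x >= 1.0 / n:
        return 1.0
    return n * x


def l1_distance(n, m, steps=100_000):
    """Midpoint-rule approximation of the integral of |f_n - f_m| over [-1, 1]."""
    h = 2.0 / steps
    total = 0.0
    for i in range(steps):
        x = -1.0 + (i + 0.5) * h
        total += abs(f(n, x) - f(m, x))
    return total * h


for n, m in [(2, 5), (10, 40), (50, 51)]:
    approx = l1_distance(n, m)
    exact = 1.0 / n - 1.0 / m
    print(n, m, round(approx, 5), round(exact, 5))
```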

7.33 We first prove that $d$ is a distance on $\mathbb{R}$. Firstly, it is clear that for
all $x, y \in \mathbb{R}$

\[ 0 \le d(x, y) = d(y, x). \]

Furthermore, if $d(x, y) = 0$ then $|x - y| = 0$, and thus $x = y$. For the triangle
inequality, consider the function $f : [0, \infty) \to [0, \infty)$ given by

\[ f(t) = \frac{t}{1 + t} \]

and notice that the proposed distance function is

\[ d(x, y) = f(d_E(x, y)) \]

where $d_E(x, y) = |x - y|$ is the Euclidean metric on $\mathbb{R}$. Using elementary
analysis it is seen that $f$ is monotonically increasing and that it is concave with
$f(0) = 0$, and thus it is also subadditive, i.e.,

\[ f(s + t) \le f(s) + f(t). \]

We may now argue as follows. For all $x, y, z \in \mathbb{R}$

\[ d(x, z) = f(d_E(x, z)) \le f(d_E(x, y) + d_E(y, z)), \]

where we used the monotonicity of $f$. The subadditivity of $f$ now implies that

\[ f(d_E(x, y) + d_E(y, z)) \le f(d_E(x, y)) + f(d_E(y, z)) = d(x, y) + d(y, z), \]

establishing the triangle inequality, and thus that $(\mathbb{R}, d)$ is a metric space.

We now show that $d$ is not induced by any norm. Indeed, if there were such a
norm $x \mapsto \|x\|$ for which $d(x, y) = \|x - y\|$, then, for all $x, y \in \mathbb{R}$
and all $\lambda \in \mathbb{R}$, it would hold that

\[ d(\lambda x, \lambda y) = \|\lambda x - \lambda y\| = |\lambda| \cdot \|x - y\| = |\lambda| \cdot d(x, y). \]

However, by using the explicit expression of $d$, this would read

\[ \frac{|\lambda x - \lambda y|}{1 + |\lambda x - \lambda y|} = |\lambda| \cdot \frac{|x - y|}{1 + |x - y|} \]

and taking, e.g., $x = 1$, $y = 0$, and $\lambda = 2$, the left-hand side is $2/3$ while
the right-hand side is $2 \cdot 1/2 = 1$, yielding an absurdity.
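
Both halves of the argument can be spot-checked numerically. The sketch below assumes the distance $d(x, y) = |x - y| / (1 + |x - y|)$; it tests the triangle inequality on a finite grid (a spot check, not a proof) and then evaluates the two sides of the would-be homogeneity identity at $x = 1$, $y = 0$, $\lambda = 2$, where the right-hand side is read as $|\lambda| \cdot d(x, y)$.

```python
def d(x, y):
    """The bounded distance |x - y| / (1 + |x - y|) on the real line."""
    gap = abs(x - y)
    return gap / (1.0 + gap)


# Triangle inequality on a small grid of sample points, with a tiny
# tolerance for floating-point rounding.
grid = [k / 4.0 for k in range(-20, 21)]
triangle_ok = all(d(x, z) <= d(x, y) + d(y, z) + 1e-12
                  for x in grid for y in grid for z in grid)
print(triangle_ok)  # True

# Failure of scale homogeneity at x = 1, y = 0 with the scalar 2.
lhs = d(2 * 1.0, 2 * 0.0)  # d(2x, 2y), which is 2/3
rhs = 2 * d(1.0, 0.0)      # |2| * d(x, y), which is 1
print(lhs, rhs)
```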

7.34 To show that the topology induced by $\|\cdot\|_2$ is finer than the topology induced
by $\|\cdot\|_1$ it suffices to show that the identity function

\[ \mathrm{id} : (X, \|\cdot\|_2) \to (X, \|\cdot\|_1) \]

is continuous. And indeed, given $\varepsilon > 0$ let $\delta = \varepsilon/K$. Then, for
all $x_1, x_2 \in X$ such that $\|x_1 - x_2\|_2 < \delta$, one has

\[ \|\mathrm{id}(x_1) - \mathrm{id}(x_2)\|_1 = \|x_1 - x_2\|_1 \le K \cdot \|x_1 - x_2\|_2 < K \delta = \varepsilon. \]

7.35 1. Given $x, y \in V$ with $\|x\| = \|y\| = 1$, for every $0 \le \alpha \le 1$ the
triangle inequality gives

\[ \|\alpha x + (1 - \alpha) y\| \le \|\alpha x\| + \|(1 - \alpha) y\| = \alpha \|x\| + (1 - \alpha) \|y\| = 1. \]

2. Suppose now that $\|x\|^2 = \langle x, x \rangle$ for an inner product
$\langle \cdot, \cdot \rangle$ on $V$, and let us assume the ground field is $\mathbb{R}$
(the case $K = \mathbb{C}$ is left as an extra exercise). If $x \neq y$ satisfy
$\|x\| = \|y\| = 1$, then for all $0 < \alpha < 1$

\[ \|\alpha x + (1 - \alpha) y\|^2 = \alpha^2 + (1 - \alpha)^2 + 2 \alpha (1 - \alpha) \langle x, y \rangle. \]

By the Cauchy-Schwarz Inequality we have that

\[ |\langle x, y \rangle| \le \|x\| \|y\| = 1. \]

Moreover, $\langle x, y \rangle = 1$ is impossible, since it would give
$\|x - y\|^2 = \|x\|^2 + \|y\|^2 - 2 \langle x, y \rangle = 0$, contradicting $x \neq y$.
Thus $\langle x, y \rangle < 1$ and, since $2 \alpha (1 - \alpha) > 0$, it follows that

\[ \|\alpha x + (1 - \alpha) y\|^2 < 1 + 2 \alpha^2 - 2 \alpha + 2 \alpha - 2 \alpha^2 = 1, \]

showing that $\|\alpha x + (1 - \alpha) y\| < 1$ and thus the only points of intersection of
$L(x, y)$ with the unit sphere occur when $\alpha = 0$ or $\alpha = 1$, at the points $y$
and $x$, respectively.

3. Assume that $x \neq y$ satisfy $\|x\| = \|y\| = 1$ in a strictly convex space. By the
triangle inequality $\|x + y\| \le \|x\| + \|y\| = 2$, so we just need to show that
$\|x + y\| = 2$ is impossible. Indeed, if $\|x + y\| = 2$, then

\[ \|(1/2) x + (1/2) y\| = (1/2) \|x + y\| = 1, \]

contradicting strict convexity.

4. $\mathbb{R}^2$ with the $\ell_2$ norm is strictly convex since the $\ell_2$ norm is
induced by an inner product (the standard inner product). That $\mathbb{R}^2$ with the
$\ell_1$ norm is not strictly convex is seen by choosing $x = (1, 0)$ and $y = (0, 1)$.
Then $\|x\|_1 = \|y\|_1 = 1$ but $\|x + y\|_1 = 2$. Similarly, for $x = (1, 1)$ and
$y = (1, -1)$ it holds that $\|x\|_\infty = \|y\|_\infty = 1$, but $\|x + y\|_\infty = 2$,
showing that the $\ell_\infty$ norm is not strictly convex either.
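
The three norms are easy to compare on the pairs of unit vectors used above. In the following sketch (the helper names are ours), the midpoint of each pair is computed and its norm printed: for the $\ell_1$ and $\ell_\infty$ norms the midpoint stays on the unit sphere, while for the $\ell_2$ norm it falls strictly inside the unit ball.

```python
import math


def norm_1(v):
    return sum(abs(c) for c in v)


def norm_2(v):
    return math.sqrt(sum(c * c for c in v))


def norm_inf(v):
    return max(abs(c) for c in v)


def midpoint(x, y):
    return tuple((a + b) / 2.0 for a, b in zip(x, y))


# l_1: x = (1, 0) and y = (0, 1) are unit vectors whose midpoint has norm 1.
print(norm_1(midpoint((1, 0), (0, 1))))     # 1.0
# l_inf: x = (1, 1) and y = (1, -1) behave the same way.
print(norm_inf(midpoint((1, 1), (1, -1))))  # 1.0
# l_2: the midpoint of the first pair lies strictly inside the unit ball.
print(norm_2(midpoint((1, 0), (0, 1))))     # about 0.7071
```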

7.36 To show that the kernel of $A$ is a closed linear subspace of $V$ recall that the
kernel, by definition, is the set $A^{-1}(\{0\})$. The fact that the kernel is a linear
subspace of the domain was established in the main text and in any case is easy to
re-establish if needed. To show that the kernel is closed recall that the inverse
image under a continuous function of a closed set is closed. Since a bounded linear
operator is continuous, it suffices to show that $\{0\}$ is a closed set in $W$. Indeed,
the topology induced on $W$ by the norm is metrizable and thus Hausdorff, and thus
every singleton set is closed.

7.37 The verification that $A$ is a linear operator is trivial, and thus we omit it. Next
we observe that for all $x \in \ell_2$ with $x \neq 0$

\[ \|Ax\|^2 = \sum_{n=1}^{\infty} \left(1 - \frac{1}{n}\right)^2 |x_n|^2 < \sum_{n=1}^{\infty} |x_n|^2 = \|x\|^2 \]

and thus

\[ \|Ax\| < \|x\| \]

holds for all $x \neq 0$. It follows that $\|A\| \le 1$ and that if $\|A\| = 1$, then the
norm is never attained. To see that the operator norm of $A$ is indeed precisely 1, consider

\[ e_n = (0, 0, \dots, 0, 1, 0, 0, \dots) \]

with 1 in the $n$-th position. It holds that

\[ \|A e_n\| = 1 - \frac{1}{n} \]

and since $\|e_n\| = 1$ it follows that $\|A\| \ge \|A e_n\|$, for all $n \ge 1$, and so we
conclude that indeed $\|A\| = 1$.
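
The behaviour of $\|A e_n\|$ can be observed on finitely supported sequences. The sketch below assumes, in line with the computation above, that $A$ acts coordinatewise by $(Ax)_n = (1 - 1/n) x_n$; the truncation length and the helper names are ours. The printed norms equal $1 - 1/n$, approaching 1 without reaching it.

```python
import math


def apply_A(x):
    """Apply (Ax)_n = (1 - 1/n) x_n to a finitely supported sequence (1-based n)."""
    return [(1.0 - 1.0 / n) * c for n, c in enumerate(x, start=1)]


def norm_2(x):
    return math.sqrt(sum(c * c for c in x))


def unit_vector(n, length):
    """The sequence e_n truncated to the given length, with 1 in position n."""
    return [1.0 if i == n else 0.0 for i in range(1, length + 1)]


for n in (1, 2, 10, 1000):
    e_n = unit_vector(n, 1000)
    print(n, norm_2(apply_A(e_n)))  # equals 1 - 1/n up to rounding
```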

7.38 The composition of linear operators is always a linear operator, so all we need
to do is show the composition $BA$ is bounded, which will be done by showing the
requested upper bound of the operator norm. Indeed, for all $x \in U$

\[ \|BAx\| = \|B(Ax)\| \le \|B\| \|Ax\| \le \|B\| \|A\| \|x\| \]

and the claim follows.

7.39 In an exercise the reader was requested to prove that for bounded linear operators
$A, B \in \mathcal{B}(\mathscr{B})$, if $A$ is invertible and $\|A - B\| < \|A^{-1}\|^{-1}$,
then $B$ is invertible. With this result in mind one just needs to correctly interpret
the meaning of the claim that $G$ is open. Let $A \in G$ be given. To show that $G$ is
open it suffices to show that $G$ contains an open ball with positive radius and centre
$A$. In other words, we need to find $\varepsilon > 0$ such that every
$B \in \mathcal{B}(\mathscr{B})$ with $\|A - B\| < \varepsilon$ is invertible. By the above
result, one may take $\varepsilon = \|A^{-1}\|^{-1}$, which is positive (since $A^{-1}$ is a
bounded operator).

7.40 Notice that the domain of $ST$ is the set of all $x \in \mathscr{B}$ such that $Tx$
lies in the domain of $S$. Given a sequence $\{x_n\}$ in the domain of $ST$, suppose that
$x_n \to x_0$ and that $ST x_n \to z$. We need to show that $x_0$ is in the domain of $ST$
and that $ST x_0 = z$. Indeed, since $T$ is continuous, it follows that

\[ T x_n \to T x_0. \]

Since $T x_n$ is in the domain of $S$ and since $S(T x_n) \to z$, it follows from the fact
that $S$ is closed that $T x_0$ is in the domain of $S$ and that $S T x_0 = z$. Thus, as was
required, we see that $x_0$ is in the domain of $ST$ and that $ST x_0 = z$, showing $ST$ is
closed.


7.6.5 Topological Groups 

7.41 Firstly, note that $\mathbb{R}^n$ with the Euclidean topology is a Hausdorff
topological space since the topology is metrizable. Both the group sum

\[ + : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^n, \quad ((a_1, \dots, a_n), (b_1, \dots, b_n)) \mapsto (a_1 + b_1, \dots, a_n + b_n) \]

and the group inverse

\[ - : \mathbb{R}^n \to \mathbb{R}^n, \quad (a_1, \dots, a_n) \mapsto (-a_1, \dots, -a_n) \]

are continuous with respect to the Euclidean topology on $\mathbb{R}^n$ and to the product
topology on $\mathbb{R}^n \times \mathbb{R}^n$. Hence, the additive group $\mathbb{R}^n$
with the Euclidean topology is a topological group. In fact, the map

\[ \mathbb{R} \times \mathbb{R}^n \to \mathbb{R}^n, \quad (\lambda, (a_1, \dots, a_n)) \mapsto (\lambda \cdot a_1, \dots, \lambda \cdot a_n) \]

is also continuous with respect to the Euclidean topology of $\mathbb{R}^n$ and the product
topology of $\mathbb{R} \times \mathbb{R}^n$. That is to say, the $\mathbb{R}$-vector space
$\mathbb{R}^n$ is actually a topological vector space.

7.42 Note that $SL_n(\mathbb{R}) = \det^{-1}(\{1\})$, and $\det : GL_n(\mathbb{R}) \to \mathbb{R}$
is continuous. Hence $SL_n(\mathbb{R})$ is a closed subgroup of the topological group
$GL_n(\mathbb{R})$. In particular, $SL_n(\mathbb{R})$ is a topological group.

7.43 Note that $O(n) = \varphi^{-1}(\{\mathrm{id}_n\})$, where
$\varphi : GL_n(\mathbb{R}) \to GL_n(\mathbb{R})$ is given by

\[ \varphi(A) = A^t A, \]

and is a continuous map. Hence $O(n)$ is a closed subgroup of the topological group
$GL_n(\mathbb{R})$. In particular, $O(n)$ is a topological group.

7.44 Note that $SO(n) = \det^{-1}(\{1\})$, where $\det : O(n) \to \{1, -1\}$ is continuous.
Hence $SO(n)$ is a closed subgroup of the topological group $O(n)$ and is thus a
topological group.

7.45 Note that $Sp_{2n}(\mathbb{R}) = \varphi^{-1}(\{J_0\})$, where
$\varphi : GL_{2n}(\mathbb{R}) \to GL_{2n}(\mathbb{R})$, given by

\[ \varphi(A) = A^t \cdot J_0 \cdot A, \]

is a continuous map. Hence $Sp_{2n}(\mathbb{R})$ is a closed subgroup of the topological
group $GL_{2n}(\mathbb{R})$, and thus is itself a topological group.

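7.46 For a given $B \in SO(n)$, we construct a path

\[ \gamma : [0, 1] \to SO(n) \quad \text{such that} \quad \gamma(0) = \mathrm{id}_n \text{ and } \gamma(1) = B. \]

The matrix $B$ being orthogonal, there exists an orthogonal matrix $M$ such that

\[ B = M^t K M \]

where

\[ K = \begin{pmatrix} \mathrm{id}_p & & \\ & -\mathrm{id}_q & \\ & & S \end{pmatrix} \]

with $q = 2r$ and, for $k = (n - (p + q))/2$,

\[ S = \begin{pmatrix}
\cos \alpha_1 & -\sin \alpha_1 & & & \\
\sin \alpha_1 & \cos \alpha_1 & & & \\
& & \ddots & & \\
& & & \cos \alpha_k & -\sin \alpha_k \\
& & & \sin \alpha_k & \cos \alpha_k
\end{pmatrix} \]

for some $\alpha_1, \dots, \alpha_k \in (0, \pi) \cup (\pi, 2\pi)$. For $t \in [0, 1]$, define

\[ K_t = \begin{pmatrix} \mathrm{id}_p & & \\ & R_t & \\ & & S_t \end{pmatrix} \]

where $R_t$ consists of $r$ diagonal blocks,

\[ R_t = \begin{pmatrix}
\cos(t\pi) & -\sin(t\pi) & & & \\
\sin(t\pi) & \cos(t\pi) & & & \\
& & \ddots & & \\
& & & \cos(t\pi) & -\sin(t\pi) \\
& & & \sin(t\pi) & \cos(t\pi)
\end{pmatrix} \]

and

\[ S_t = \begin{pmatrix}
\cos(t\alpha_1) & -\sin(t\alpha_1) & & & \\
\sin(t\alpha_1) & \cos(t\alpha_1) & & & \\
& & \ddots & & \\
& & & \cos(t\alpha_k) & -\sin(t\alpha_k) \\
& & & \sin(t\alpha_k) & \cos(t\alpha_k)
\end{pmatrix}. \]

Hence define, for all $t \in [0, 1]$,

\[ \gamma(t) = M^t K_t M \in SO(n). \]

Since $K_0 = \mathrm{id}_n$ and $K_1 = K$, the path $\gamma : [0, 1] \to SO(n)$ connects
$\mathrm{id}_n$ to $B$, and we get that $SO(n)$ is path-connected and hence also connected.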
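
The building block of this path is a single $2 \times 2$ rotation whose angle is scaled by $t$. The sketch below (plain nested lists, helper names ours, with an arbitrary sample angle) checks that such a block has determinant 1 and is orthogonal for several values of $t$, and that a rotation by $\pi$ is, up to rounding, $-\mathrm{id}_2$, which is why a block of angle $t\pi$ runs from $\mathrm{id}_2$ to $-\mathrm{id}_2$.

```python
import math


def rotation(angle):
    """The 2x2 rotation matrix by the given angle, as nested lists."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s], [s, c]]


def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def gram(m):
    """Return the product m^t m for a 2x2 matrix m."""
    return [[sum(m[k][i] * m[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


alpha = 2.0  # a sample angle in (0, pi), chosen only for illustration
for t in (0.0, 0.25, 0.5, 0.75, 1.0):
    block = rotation(t * alpha)
    # The determinant stays 1 and m^t m stays the identity along the path.
    print(t, det2(block), gram(block))

# At angle pi the block is -id_2 up to rounding.
print(rotation(math.pi))
```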

7.47 Note that $O(n)$ is the disjoint union of $O^+(n)$ and $O^-(n)$, where

\[ O^+(n) = \{A \in O(n) \mid \det(A) > 0\} \]

and

\[ O^-(n) = \{A \in O(n) \mid \det(A) < 0\}. \]

Furthermore, $O^+(n)$ and $O^-(n)$ are homeomorphic. In fact, the map

\[ O^+(n) \to O^-(n), \quad
\begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1n} \\ \vdots & \vdots & & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{nn} \end{pmatrix}
\mapsto
\begin{pmatrix} -x_{11} & x_{12} & \cdots & x_{1n} \\ \vdots & \vdots & & \vdots \\ -x_{n1} & x_{n2} & \cdots & x_{nn} \end{pmatrix} \]

provides a homeomorphism. Therefore, it suffices to show that $O^+(n)$ is path-connected.
But $O^+(n) = SO(n)$, and the claim follows since it was already
established that $SO(n)$ is path-connected.

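7.48 For a given $B \in GL_n^+(\mathbb{R})$, we construct a path

\[ \gamma : [0, 1] \to GL_n^+(\mathbb{R}) \quad \text{such that} \quad \gamma(0) = \mathrm{id}_n \text{ and } \gamma(1) = B. \]

By the polar decomposition theorem there are a symmetric positive-definite matrix
$L$ and an orthogonal matrix $P$ such that $B = LP$. Since $\det(B) > 0$, $\det(L) > 0$
and $P$ is orthogonal, it follows that $\det(P) = 1$, and hence $P \in O^+(n) = SO(n)$.
The matrix $L$ being symmetric, there exists an orthogonal matrix $C$ such that

\[ L = C^t \Lambda C \quad \text{where} \quad \Lambda = \begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix}, \]

with $\lambda_1, \dots, \lambda_n > 0$ since $L$ is positive-definite. For all $t \in [0, 1]$
define

\[ \Lambda_t = \begin{pmatrix} t \lambda_1 + (1 - t) & & \\ & \ddots & \\ & & t \lambda_n + (1 - t) \end{pmatrix}, \]

whose diagonal entries are all positive. Since the topological group $SO(n)$ is
path-connected, there exists a path

\[ \mu : [0, 1] \to SO(n) \quad \text{such that} \quad \mu(0) = \mathrm{id}_n \text{ and } \mu(1) = P. \]

Hence define, for all $t \in [0, 1]$,

\[ \gamma(t) = C^t \Lambda_t C \mu(t). \]

Since $\gamma : [0, 1] \to GL_n^+(\mathbb{R})$ connects $\mathrm{id}_n$ to $B$, we get that
$GL_n^+(\mathbb{R})$ is path-connected and hence also connected.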
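
The only step where the determinant could leave the positive half-line is the straight-line interpolation of the eigenvalues. The sketch below uses hypothetical positive eigenvalues, chosen only for illustration, and checks that each interpolated entry $t \lambda_i + (1 - t)$ stays positive for $t \in [0, 1]$. Their product is the determinant of the diagonal factor of $\gamma(t)$, and since the orthogonal factors have determinant $\pm 1$ with product 1, it is also $\det \gamma(t)$.

```python
def interpolated_eigenvalues(eigenvalues, t):
    """Diagonal entries t * lam + (1 - t) of the interpolated diagonal matrix."""
    return [t * lam + (1.0 - t) for lam in eigenvalues]


def product(values):
    result = 1.0
    for v in values:
        result *= v
    return result


eigenvalues = [0.01, 3.0, 250.0]  # hypothetical positive eigenvalues of L
for k in range(11):
    t = k / 10.0
    diag = interpolated_eigenvalues(eigenvalues, t)
    # All entries are convex combinations of a positive number and 1,
    # so the determinant (their product) never vanishes.
    print(t, min(diag) > 0.0, product(diag))
```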

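7.49 Note that since the function $\det : M_n(\mathbb{R}) \to \mathbb{R}$ is continuous and
surjective, the sets

\[ GL_n^+(\mathbb{R}) = \det{}^{-1}((0, +\infty)) \]

and

\[ GL_n^-(\mathbb{R}) = \det{}^{-1}((-\infty, 0)) \]

are non-empty open sets and

\[ GL_n^+(\mathbb{R}) \cap GL_n^-(\mathbb{R}) = \emptyset \]

while

\[ GL_n(\mathbb{R}) = GL_n^+(\mathbb{R}) \cup GL_n^-(\mathbb{R}). \]

Note also that $GL_n^+(\mathbb{R})$ and $GL_n^-(\mathbb{R})$ are homeomorphic. In fact, the map

\[ GL_n^+(\mathbb{R}) \to GL_n^-(\mathbb{R}), \quad
\begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1n} \\ \vdots & \vdots & & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{nn} \end{pmatrix}
\mapsto
\begin{pmatrix} -x_{11} & x_{12} & \cdots & x_{1n} \\ \vdots & \vdots & & \vdots \\ -x_{n1} & x_{n2} & \cdots & x_{nn} \end{pmatrix} \]

provides a homeomorphism. The claim now follows since it was already proved that
$GL_n^+(\mathbb{R})$ is path-connected, and hence $GL_n^-(\mathbb{R})$ is path-connected as well.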

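7.50 For a given $B \in SL_n(\mathbb{R})$ we construct a path

\[ \gamma : [0, 1] \to SL_n(\mathbb{R}) \quad \text{such that} \quad \gamma(0) = \mathrm{id}_n \text{ and } \gamma(1) = B. \]

By the polar decomposition theorem there exist a symmetric positive-definite matrix
$L$ and an orthogonal matrix $P$ such that $B = LP$. In particular, since $\det(B) = 1$,
$\det(L) > 0$ and $\det(P) = \pm 1$, it holds that $\det(P) = 1$ and so $\det(L) = 1$.
The matrix $L$ being symmetric, there exists an orthogonal matrix $C$ such that

\[ L = C^t \Lambda C \]

and since $L$ is positive-definite with determinant equal to 1, the diagonal matrix
$\Lambda$ has the form

\[ \Lambda = \begin{pmatrix} \exp(\mu_1) & & \\ & \ddots & \\ & & \exp(\mu_n) \end{pmatrix} \]

for some $\mu_1, \dots, \mu_n \in \mathbb{R}$ such that $\sum_{j=1}^{n} \mu_j = 0$. The matrix
$P$ being orthogonal, there exists an orthogonal matrix $M$ such that

\[ P = M^t K M \]

where

\[ K = \begin{pmatrix} \mathrm{id}_p & & \\ & -\mathrm{id}_q & \\ & & S \end{pmatrix} \]

with $q = 2r$ and, for $k = (n - (p + q))/2$,

\[ S = \begin{pmatrix}
\cos \alpha_1 & -\sin \alpha_1 & & & \\
\sin \alpha_1 & \cos \alpha_1 & & & \\
& & \ddots & & \\
& & & \cos \alpha_k & -\sin \alpha_k \\
& & & \sin \alpha_k & \cos \alpha_k
\end{pmatrix} \]

for some $\alpha_1, \dots, \alpha_k \in (0, \pi) \cup (\pi, 2\pi)$. For all $t \in [0, 1]$ define

\[ \Lambda_t = \begin{pmatrix} \exp(t \mu_1) & & \\ & \ddots & \\ & & \exp(t \mu_n) \end{pmatrix} \]

and define

\[ K_t = \begin{pmatrix} \mathrm{id}_p & & \\ & R_t & \\ & & S_t \end{pmatrix} \]

where $R_t$ consists of $r$ diagonal blocks,

\[ R_t = \begin{pmatrix}
\cos(t\pi) & -\sin(t\pi) & & & \\
\sin(t\pi) & \cos(t\pi) & & & \\
& & \ddots & & \\
& & & \cos(t\pi) & -\sin(t\pi) \\
& & & \sin(t\pi) & \cos(t\pi)
\end{pmatrix} \]

and

\[ S_t = \begin{pmatrix}
\cos(t\alpha_1) & -\sin(t\alpha_1) & & & \\
\sin(t\alpha_1) & \cos(t\alpha_1) & & & \\
& & \ddots & & \\
& & & \cos(t\alpha_k) & -\sin(t\alpha_k) \\
& & & \sin(t\alpha_k) & \cos(t\alpha_k)
\end{pmatrix}. \]

Hence, define for all $t \in [0, 1]$

\[ \gamma(t) = C^t \Lambda_t C M^t K_t M \in SL_n(\mathbb{R}), \]

noting that $\det(\Lambda_t) = \exp(t \sum_{j=1}^{n} \mu_j) = 1$ and $\det(K_t) = 1$.
Since $\gamma : [0, 1] \to SL_n(\mathbb{R})$ connects $\mathrm{id}_n$ to $B$, we get that
$SL_n(\mathbb{R})$ is path-connected and hence connected.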